COMPOSITIONS AND METHODS FOR HYDROXYLATING EPOTHILONES

Field of the Invention 

The present invention relates to isolated nucleic acid sequences and
polypeptides encoded thereby for epothilone B hydroxylase and mutants and variants
thereof, and a ferredoxin located downstream from the epothilone B hydroxylase
gene. The present invention also relates to recombinant microorganisms expressing
epothilone B hydroxylase or a mutant or variant thereof and/or ferredoxin which are
capable of hydroxylating small organic molecule compounds, such as epothilones,
having a terminal alkyl group to produce compounds having a terminal hydroxyalkyl
group. Also provided are methods for recombinantly producing such microorganisms
as well as methods for using these recombinant microorganisms in the synthesis of
compounds having a terminal hydroxyalkyl group. The compositions and methods
of the present invention are useful in preparation of epothilones having a variety of
utilities in the pharmaceutical field. A novel epothilone analog produced using a
mutant of epothilone B hydroxylase of the present invention is also described.

Background of the Invention

Epothilones are macrolide compounds that find utility in the pharmaceutical
field. For example, epothilones A and B having the structures:

[structure drawing not reproduced]

Epothilone A R=H
Epothilone B R=Me

have been found to exert microtubule-stabilizing effects similar to paclitaxel
(TAXOL®) and hence cytotoxic activity against rapidly proliferating cells, such as
tumor cells or cells associated with other hyperproliferative cellular diseases, see
Bollag et al., Cancer Res., Vol. 55, No. 11, 2325-2333 (1995).

Epothilones A and B are natural anticancer agents produced by Sorangium
cellulosum that were first isolated and characterized by Hofle et al., DE 4138042; WO
93/10121; Angew. Chem. Int. Ed. Engl., Vol. 35, No. 13/14, 1567-1569 (1996); and J.
Antibiot., Vol. 49, No. 6, 560-563 (1996). Subsequently, the total syntheses of
epothilones A and B have been published by Balog et al., Angew. Chem. Int. Ed.
Engl., Vol. 35, No. 23/24, 2801-2803 (1996); Meng et al., J. Am. Chem. Soc., Vol.
119, No. 42, 10073-10092 (1997); Nicolaou et al., J. Am. Chem. Soc., Vol. 119, No.
34, 7974-7991 (1997); Schinzer et al., Angew. Chem. Int. Ed. Engl., Vol. 36, No. 5,
523-524 (1997); and Yang et al., Angew. Chem. Int. Ed. Engl., Vol. 36, No. 1/2,
166-168 (1997). WO 98/25929 disclosed the methods for chemical synthesis of
epothilone A, epothilone B, analogs of epothilone and libraries of epothilone analogs.
The structure and production from Sorangium cellulosum DSM 6773 of epothilones
C, D, E, and F were disclosed in WO 98/22461. Figure 1 provides a diagram of the
biotransformation as described in WO 00/39276 of epothilone B to epothilone F in
Actinomycetes species strain SC15847 (ATCC PTA-1043), subsequently identified as
Amycolatopsis orientalis.

Cytochrome P450 enzymes are found in prokaryotic and eukaryotic cells and
have in common a heme binding domain which can be distinguished by an
absorbance peak at 450 nm when complexed with carbon monoxide. Cytochrome
P450 enzymes perform a broad spectrum of oxidative reactions on primarily
hydrophobic substrates including aromatic and benzylic rings, and alkanes. In
prokaryotes they are found as detoxifying systems and as a first enzymatic step in
metabolizing substrates such as toluene, benzene and camphor. Cytochrome P450
genes have also been found in biosynthetic pathways of secondary metabolites such as
nikkomycin in Streptomyces tendae (Bruntner, C. et al., 1999, Mol. Gen. Genet. 262:
102-114), doxorubicin (Dickens, M.L., Strohl, W.R., 1996, J. Bacteriol. 178: 3389-
3395) and in the epothilone biosynthetic cluster of Sorangium cellulosum (Julien, B.
et al., 2000, Gene, 249: 153-160). With a few exceptions, the cytochrome P450
systems in prokaryotes are composed of three proteins: a ferredoxin NADH or
NADPH dependent reductase, an iron-sulfur ferredoxin and the cytochrome P450
enzyme (Lewis, D.F., Hlavica, P., 2000, Biochim. Biophys. Acta, 1460: 353-374).
Electrons are transferred from ferredoxin reductase to the ferredoxin and finally to the
cytochrome P450 enzyme for the splitting of molecular oxygen.

Summary of the Invention

An object of the present invention is to provide isolated nucleic acid sequences 
encoding epothilone B hydroxylase and variants or mutants thereof and isolated 
nucleic acid sequences encoding ferredoxin or variants or mutants thereof. 

Another object of the present invention is to provide isolated polypeptides
comprising amino acid sequences of epothilone B hydroxylase and variants or
mutants thereof and isolated polypeptides comprising amino acid sequences of
ferredoxin and variants or mutants thereof.

Another object of the present invention is to provide structure coordinates of
the homology model of the epothilone B hydroxylase. The structure coordinates are
listed in Appendix 1. This model of the present invention provides a means for
designing modulators of a biological function of epothilone B hydroxylase as well as
additional mutants of epothilone B hydroxylase with altered specificities.

Another object of the present invention is to provide vectors comprising
nucleic acid sequences encoding epothilone B hydroxylase or a variant or mutant
thereof and/or ferredoxin or a variant or mutant thereof. In a preferred embodiment,
these vectors further comprise a nucleic acid sequence encoding ferredoxin.

Another object of the present invention is to provide host cells comprising a
vector containing a nucleic acid sequence encoding epothilone B hydroxylase or a
variant or mutant thereof and/or ferredoxin or a variant or mutant thereof.

Another object of the present invention is to provide a method for producing
recombinant microorganisms that are capable of hydroxylating compounds, and in
particular epothilones, having a terminal alkyl group to produce compounds having a
terminal hydroxyalkyl group.

Another object of the present invention is to provide microorganisms produced
recombinantly which are capable of hydroxylating compounds, and in particular
epothilones, having a terminal alkyl group to produce compounds having a terminal
hydroxyalkyl group.

Another object of the present invention is to provide methods for
hydroxylating compounds in these recombinant microorganisms. In particular, the
present invention provides a method for the preparation of hydroxyalkyl-bearing
epothilones, which compounds find utility as antitumor agents and as starting
materials in the preparation of other epothilone analogs.

Yet another object of the present invention is to provide a compound of
Formula A:

[structure drawing of Formula A not reproduced]

referred to herein as 24-OH epothilone B or 24-OH EpoB, as well as compositions
and methods for production of compositions comprising the compound of Formula A.



Brief Description of the Figures 

Figure 1 provides a schematic of the biotransformation as set forth in WO
00/39276, U.S. Application Serial No. 09/468,854, filed December 21, 1999, of
epothilone B to epothilone F by Amycolatopsis orientalis strain SC15847 (PTA-1043).

Figure 2 shows the nucleic acid sequence alignments of SEQ ID NO:5 through
SEQ ID NO:22 used to design the PCR primers for cloning of the nucleic acid
sequence encoding epothilone B hydroxylase.

Figure 3 shows the sequence alignment between epothilone B hydroxylase
(SEQ ID NO:2) and EryF (PDB code 1JIN chain A; SEQ ID NO:76). The asterisks
indicate sequence identities, the colons (:) similar residues.

Figure 4 provides a homology model of epothilone B hydroxylase based upon
sequence alignment with EryF as shown in Figure 3.

Figure 5 shows an energy plot of the epothilone B hydroxylase model
(indicated by dashed line) relative to EryF (PDB code 1JIN; indicated by solid line).
An averaging window size of 51 residues was used, i.e., the energy at a given residue
position is calculated as the average of the energies of the 51 residues in the sequence
that lie with the given residue at the central position.
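
The 51-residue averaging described for Figure 5 can be sketched as follows. This is a minimal illustration and not the software used to generate the figure; the function name windowed_average is hypothetical, and the treatment of positions near the chain ends (the window is simply truncated there) is an assumption that the text does not specify.

```python
def windowed_average(energies, window=51):
    """Average per-residue energies over a window centred on each residue.

    Assumption: near the chain ends the window is truncated to the
    residues that exist, since the text does not describe end handling.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    averaged = []
    for i in range(len(energies)):
        lo = max(0, i - half)                   # first residue in the window
        hi = min(len(energies), i + half + 1)   # one past the last residue
        averaged.append(sum(energies[lo:hi]) / (hi - lo))
    return averaged


# Toy example with a window of 3 instead of 51
print(windowed_average([1, 2, 3, 4, 5], window=3))
```

With the default window of 51, each interior residue is averaged with its 25 neighbours on either side.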

Detailed Description of the Invention 



The present invention relates to isolated nucleic acid sequences and
polypeptides and methods for obtaining compounds with desired substituents at a
terminal carbon position. In particular, the present invention provides compositions
and methods for the preparation of hydroxyalkyl-bearing epothilones, which
compounds find utility as antitumor agents and as starting materials in the preparation
of other epothilone analogs.

The term "epothilone," as used herein, denotes compounds containing an
epothilone core and a side chain group as defined herein. The term "epothilone core,"
as used herein, denotes a moiety containing the core structure (with the numbering of
ring system positions used herein shown):

[structure drawing of the epothilone core not reproduced]

wherein the substituents are as follows:

Q is selected from the group consisting of

[structure drawings not reproduced];

W is O or NR6;

X is selected from the group consisting of O, H and OR7;

M is O, S, NR8, CR9R10;

B1 and B2 are selected from the group consisting of OR11, OCOR12;

R1-R5 and R12-R17 are selected from the group consisting of H, alkyl,
substituted alkyl, aryl, and heterocyclo, and when R1 and R2 are alkyl they can be
joined to form a cycloalkyl;

R6 is selected from the group consisting of H, alkyl, and substituted alkyl;

R7 and R11 are selected from the group consisting of H, alkyl, substituted
alkyl, trialkylsilyl, alkyldiarylsilyl and dialkylarylsilyl;

R8 is selected from the group consisting of H, alkyl, substituted alkyl, R13C=O,
R14OC=O and R15SO2; and

R9 and R10 are selected from the group consisting of H, halogen, alkyl,
substituted alkyl, aryl, heterocyclo, hydroxy, R16C=O, and R17OC=O.

The term "side chain group" refers to substituent G as defined above for
Epothilone A or B or G1 and G2 as shown below.

G1 is the following formula V

HO-CH2-(A1)n-(Q)m-(A2)o (V),

and

G2 is the following formula VI

CH3-(A1)n-(Q)m-(A2)o (VI),

where

A1 and A2 are independently selected from the group of optionally substituted
C1-C3 alkyl and alkenyl;

Q is an optionally substituted ring system containing one to three rings and at
least one carbon to carbon double bond in at least one ring; and

n, m, and o are integers independently selected from the group consisting of
zero and 1, where at least one of m, n or o is 1.

The term "terminal carbon" or "terminal alkyl group" refers to the terminal
carbon or terminal methyl group of the moiety either directly bonded to the epothilone
core at position 15 or to the terminal carbon or terminal alkyl group of the side chain
group bonded at position 15. It is understood that the term "alkyl group" includes
alkyl and substituted alkyl as defined herein.

The term "alkyl" refers to optionally substituted, straight or branched chain
saturated hydrocarbon groups of 1 to 20 carbon atoms, preferably 1 to 7 carbon atoms.
The expression "lower alkyl" refers to optionally substituted alkyl groups of 1 to 4
carbon atoms.

The term "substituted alkyl" refers to an alkyl group substituted by, for
example, one to four substituents, such as halo, trifluoromethyl, trifluoromethoxy,
hydroxy, alkoxy, cycloalkyloxy, heterocyclooxy, oxo, alkanoyl, aryloxy, alkanoyloxy,
amino, alkylamino, arylamino, aralkylamino, cycloalkylamino, heterocycloamino,
disubstituted amines in which the 2 amino substituents are selected from alkyl, aryl or
aralkyl, alkanoylamino, aroylamino, aralkanoylamino, substituted alkanoylamino,
substituted arylamino, substituted aralkanoylamino, thiol, alkylthio, arylthio,
aralkylthio, cycloalkylthio, heterocyclothio, alkylthiono, arylthiono, aralkylthiono,
alkylsulfonyl, arylsulfonyl, aralkylsulfonyl, sulfonamido (e.g. SO2NH2), substituted
sulfonamido, nitro, cyano, carboxy, carbamyl (e.g. CONH2), substituted carbamyl
(e.g. CONH alkyl, CONH aryl, CONH aralkyl or cases where there are two
substituents on the nitrogen selected from alkyl, aryl or aralkyl), alkoxycarbonyl, aryl,
substituted aryl, guanidino and heterocyclos, such as indolyl, imidazolyl, furyl,
thienyl, thiazolyl, pyrrolidyl, pyridyl, pyrimidyl and the like. Where it is noted above
that the substituent is further substituted, it will be with halogen, alkyl, alkoxy, aryl
or aralkyl.

In accordance with one aspect of the present invention there are provided
isolated polynucleotides that encode epothilone B hydroxylase, an enzyme capable of
hydroxylating epothilones having a terminal alkyl group to produce epothilones
having a terminal hydroxyalkyl group.

In accordance with another aspect of the present invention there are provided
isolated polynucleotides that encode a ferredoxin, the gene for which is located
downstream from the epothilone B hydroxylase gene. Ferredoxin is a protein of the
cytochrome P450 system.

By "polynucleotides", as used herein, it is meant to include any form of DNA
or RNA such as cDNA or genomic DNA or mRNA, respectively, encoding these
enzymes or an active fragment thereof which are obtained by cloning or produced
synthetically by well known chemical techniques. DNA may be double- or single-
stranded. Single-stranded DNA may comprise the coding or sense strand or the non-
coding or antisense strand. Thus, the term polynucleotide also includes
polynucleotides exhibiting at least 60% or more, preferably at least 80%, homology to
sequences disclosed herein, and which hybridize under stringent conditions to the
above-described polynucleotides. As used herein, the term "stringent conditions"
means hybridization conditions of 60°C at 2X SSC buffer. More preferred are isolated
nucleic acid molecules capable of hybridizing to the nucleic acid sequence set forth in
SEQ ID NO:1, 30, 32, 34, 36, 37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 70, 72 or 74
or SEQ ID NO:3, or to the complementary sequence of the nucleic acid sequence set
forth in SEQ ID NO:1, 30, 32, 34, 36, 37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 70,
72 or 74 or SEQ ID NO:3, under hybridization conditions of 3X SSC at 65°C for 16
hours, and which are capable of remaining hybridized to the nucleic acid sequence set
forth in SEQ ID NO:1, 30, 32, 34, 36, 37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 70,
72 or 74 or SEQ ID NO:3, or to the complementary sequence of the nucleic acid
sequence set forth in SEQ ID NO:1, 30, 32, 34, 36, 37, 38, 39, 40, 41, 42, 60, 62, 64,
66, 68, 70, 72 or 74 or SEQ ID NO:3, under wash conditions of 0.5X SSC, 55°C for
30 minutes.
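
The 60% and 80% homology thresholds above can be illustrated with a small calculation. The text does not say how homology is computed, so the sketch below assumes the simplest reading: percent identity over the gap-free columns of two pre-aligned sequences. The function name percent_identity and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumption: the inputs are already aligned to equal length, with '-'
    marking gaps. Other definitions of homology would give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Hypothetical aligned fragments: 4 of 5 gap-free columns match
identity = percent_identity("ATGC-A", "ATGGTA")
print(identity, identity >= 60.0, identity >= 80.0)
```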

In one embodiment, a polynucleotide of the present invention comprises the
genomic DNA depicted in SEQ ID NO:1 or a homologous sequence or fragment
thereof which encodes a polypeptide having similar activity to that of this epothilone
B hydroxylase. Alternatively, a polynucleotide of the present invention may comprise
the genomic DNA depicted in SEQ ID NO:3 or a homologous sequence or fragment
thereof which encodes a polypeptide having similar activity to this ferredoxin. Due to
the degeneracy of the genetic code, polynucleotides of the present invention may also
comprise other nucleic acid sequences encoding this enzyme and derivatives, variants
or active fragments thereof.
The present invention also relates to variants of these polynucleotides which
may be naturally occurring, i.e., present in microorganisms such as Amycolatopsis
orientalis and Amycolata autotrophica, or in soil or other sources from which nucleic
acids can be isolated, or mutants prepared by well known mutagenesis techniques.
Exemplary variant polynucleotides of the present invention are depicted in SEQ ID
NO:36-42.

By "mutants" as used herein it is meant to be inclusive of nucleic acid
sequences with one or more point mutations, or deletions or additions of nucleic acids
as compared to SEQ ID NO:1 or 3, but which still encode a polypeptide or fragment
with similar activity to the polypeptides encoded by SEQ ID NO:1 or 3. In a
preferred embodiment, mutations are made which alter the substrate specificity and/or
yield of the enzyme. A preferred region of mutation with respect to the epothilone B
hydroxylase gene is that region of the nucleic acid sequence coding for the
approximately 113 amino acid residues comprising the active site of the enzyme.
Also preferred are mutants encoding a polypeptide with at least one amino acid
substitution at amino acid position GLU31, ARG67, ARG88, ILE92, ALA93,
VAL106, ILE130, ALA140, MET176, PHE190, GLU231, SER294, PHE237, or
ILE365 of SEQ ID NO:1. Exemplary polynucleotide mutants of the present invention
are depicted in SEQ ID NO:30, 32, 34, 60, 62, 64, 66, 68, 70, 72 and 74.

Cloning of the nucleic acid sequence of SEQ ID NO:1 encoding epothilone B
hydroxylase was performed using PCR primers designed by aligning the nucleic acid
sequences of six cytochrome P450 genes from bacteria. The following cytochrome
P450 genes were aligned:

Sequence 1: Locus: STMSUACB; Accession number: M32238; Reference:
            Omer, C.A., J. Bacteriol. 172: 3335-3345 (1990)
Sequence 2: Locus: STMSUBCB; Accession number: M32239; Reference:
            Omer, C.A., J. Bacteriol. 172: 3335-3345 (1990)
Sequence 3: Locus: AB018074 (formerly STMORFA); Accession number:
            AB018074; Reference: Ueda, K., J. Antibiot. 48: 638-646 (1995)
Sequence 4: Locus: SSU65940; Accession number: U65940; Reference:
            Motamedi, H., J. Bacteriol. 178: 5243-5248 (1996)
Sequence 5: Locus: STMOLEP; Accession number: L37200; Reference:
            Rodriguez, A.M., FEMS Microbiol. Lett. 127: 117-120 (1995)
Sequence 6: Locus: SERCP450A; Accession number: M83110; Reference:
            Andersen, J.F. and Hutchinson, C.R., J. Bacteriol. 174: 725-735
            (1992)

Alignments were performed using an implementation of the algorithm of
Myers, E.W. and W. Miller, 1988, CABIOS 4:1, 11-17, namely the Align program from
Scientific and Educational Software (Durham, North Carolina, USA). Three highly
conserved regions were identified in the I-helix, containing the oxygen binding
domain, in the K-helix, and spanning the B-bulge and L-helix containing the
conserved heme binding domain. Primers were designed to the three conserved
regions identified in the alignment. Primers P450-1+ (SEQ ID NO:23) and P450-1a+
(SEQ ID NO:24) were designed from the I helix. Primer P450-2+ (SEQ ID NO:25)
was designed from the B-Bulge and L-helix region and Primer P450-3- (SEQ ID
NO:27) was designed as the reverse complement to the heme binding protein.
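
The primer design strategy above (locate conserved stretches in an alignment, then take the reverse complement for a reverse primer) can be sketched as follows. This is a simplified illustration under stated assumptions: it requires every aligned sequence to be identical across a window, whereas real primers for divergent P450 genes would normally tolerate some variation, and the toy sequences and the function names conserved_windows and reverse_complement are hypothetical, not taken from the Align program or SEQ ID NO:5-27.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def conserved_windows(aligned, width):
    """Start indices of gap-free windows in which all sequences are identical.

    Assumption: 'aligned' holds equal-length, pre-aligned sequences with
    '-' marking gaps.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    conserved = [
        len(set(column)) == 1 and column[0] != "-"
        for column in zip(*aligned)
    ]
    return [
        start
        for start in range(length - width + 1)
        if all(conserved[start:start + width])
    ]


# Hypothetical three-sequence alignment
alignment = ["ATGCCGTA", "ATGCCGAA", "TTGCCGTA"]
starts = conserved_windows(alignment, width=4)
forward_site = alignment[0][starts[0]:starts[0] + 4]
print(starts, forward_site, reverse_complement(forward_site))
```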

Genomic fragments were then amplified via polymerase chain reaction (PCR).
After PCR amplification, the reaction products were separated by gel electrophoresis
and fragments of the expected size were excised. The DNA was extracted from the
agarose gel slices using the Qiaquick gel extraction procedure (Qiagen, Santa Clarita,
California, USA). The fragments were then cloned into the PCRscript vector
(Stratagene, La Jolla, California, USA) using the PCRscript Amp cloning kit
(Stratagene). Colonies containing inserts were picked to 1-2 ml of LB broth with 100
µg/ml ampicillin and grown at 30-37°C for 16-24 hours at 230-300 rpm. Plasmid
isolation was performed using the Mo Bio miniplasmid prep kit (Mo Bio, Solana
Beach, California, USA). This plasmid DNA was used as a PCR and sequencing
template and for restriction digest analysis.

The cloned PCR products were sequenced using the Big-Dye sequencing kit
from Applied Biosystems (Foster City, California, USA) and were analyzed using the
ABI310 sequencer (Applied Biosystems, Foster City, California, USA). The sequence
of the inserts was used to perform a TblastX search, using the protocol of Altschul,
S.F., et al., J. Mol. Biol. 215:403-410 (1990), of the non-redundant protein database.
Unique sequences having a significant similarity to known cytochrome P450 proteins
were retained. Using this approach, a total of nine different P450 sequences were
identified from SC15847, seven from the genomic DNA template and two from the
cDNA. Two P450 sequences were found in common between the DNA and cDNA
templates. Of the fifty cDNA clones analyzed, two sequences were predominant,
with twenty clones each. These two genes were then cloned from the genomic DNA.

The nucleic acid sequence of the genomic DNA was determined using the
Big-Dye sequencing system (Applied Biosystems) and analyzed using an ABI310
sequencer. This sequence is depicted in SEQ ID NO:1. An open reading frame
coding for a protein of 404 amino acids and a predicted molecular weight of 44.7 kDa
was found within the cloned BglII fragment. The deduced amino acid sequence of
this polypeptide is depicted in SEQ ID NO:2. The amino acid sequence of this
polypeptide was found to share 51% identity with the NikF protein of Streptomyces
tendae (Bruntner, C. et al., 1999, Mol. Gen. Genet. 262: 102-114) and 48% identity
with the Sca-2 protein of S. carbophilus (Watanabe, I. et al., 1995, Gene 163: 81-85).
Both of these enzymes belong to the cytochrome P450 family 105. The invariable
cysteine found in the heme-binding domain of all cytochrome P450 enzymes is found
at residue 356. This gene for epothilone B hydroxylase has been named ebh. The
ATG start codon of a putative ferredoxin gene of 64 amino acids is found nine
basepairs downstream from the stop codon of ebh. This putative ferredoxin was found
to share 50% identity with the ferredoxins of S. griseolus (O'Keefe, D.P., et al., 1991,
Biochemistry 30: 447-455) and S. noursei (Brautaset, T., et al., 2000, Chem. Biol. 7:
395-403). The nucleic acid sequence encoding this ferredoxin is depicted in SEQ ID
NO:3 and the amino acid sequence for this ferredoxin polypeptide is depicted in SEQ
ID NO:4.
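
The sizes quoted above can be sanity-checked with simple arithmetic. The sketch below assumes a rule-of-thumb average residue mass of 110 Da, which is not a value from the text, so it only gives a rough estimate to compare against the predicted 44.7 kDa. The function names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # assumption: common rule-of-thumb value


def orf_length_nt(n_amino_acids, include_stop=True):
    """Nucleotides needed to encode a protein, optionally with its stop codon."""
    codons = n_amino_acids + (1 if include_stop else 0)
    return 3 * codons


def approx_mass_kda(n_amino_acids):
    """Rough protein mass in kDa from the residue count alone."""
    return n_amino_acids * AVERAGE_RESIDUE_MASS_DA / 1000.0


# ebh product (404 residues) and the putative ferredoxin (64 residues)
print(orf_length_nt(404), approx_mass_kda(404))
print(orf_length_nt(64), approx_mass_kda(64))
```

The rough estimate for 404 residues lands within about 1% of the 44.7 kDa predicted from the actual sequence.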

The ebh gene sequence was also used to isolate variant cytochrome P450
genes from other microorganisms. Exemplary variant polynucleotides ebh43491,
ebh14930, ebh53630, ebh53550, ebh39444, ebh43333 and ebh35165 of the present
invention and the species from which they were isolated are depicted in Table 1
below. The nucleic acid sequences for these variants are depicted in SEQ ID NO:36-
42, respectively.

Table 1: Variant polynucleotides

ATCC ID    Species                     ebh gene designation
43491      Amycolatopsis orientalis    ebh43491
14930      Amycolatopsis orientalis    ebh14930
53630      Amycolatopsis orientalis    ebh53630
53550      Amycolatopsis orientalis    ebh53550
39444      Amycolatopsis orientalis    ebh39444
43333      Amycolatopsis orientalis    ebh43333
35165      Amycolatopsis orientalis    ebh35165



The amino acid sequences encoded by the exemplary variants ebh43491,
ebh14930, ebh53630, ebh53550, ebh39444, ebh43333 and ebh35165 are depicted in
SEQ ID NO:43-49, respectively. Table 2 provides a summary of the amino acid
substitutions of these exemplary variants.

Table 2: Amino acid Substitutions

Position   ebh   Substitution   ebh variant
100        Gly   Ser            ebh14930, ebh43333, ebh53550, ebh43491
101        Lys   Arg            ebh14930
130        Ile   Leu            ebh14930
192        Ser   Gln            ebh14930
224        Ser   Thr            ebh14930, ebh43333, ebh53550, ebh43491
285        Ile   Val            ebh14930, ebh43333, ebh53550, ebh43491
69         Ser   Asn            ebh43333
256        Val   Ala            ebh43333, ebh53550, ebh43491
93         Ala   Ser            ebh53550
326        Asp   Glu            ebh53550, ebh43491
333        Thr   Ala            ebh53550, ebh43491
133        Leu   Met            ebh43491
398        His   Arg            ebh39444
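
Table 2 is organized by position; regrouping it by variant makes it easier to see what distinguishes each gene. The sketch below is one way to do that, assuming the table rows as read from this copy of the document (the source scan is damaged, so the entries should be checked against SEQ ID NO:43-49 before being relied upon). The function name substitutions_by_variant and the short labels such as G100S are illustrative conventions, not terms from the text.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Gln": "Q",
    "Glu": "E", "Gly": "G", "His": "H", "Ile": "I", "Leu": "L",
    "Lys": "K", "Met": "M", "Ser": "S", "Thr": "T", "Val": "V",
}

# (position, residue in ebh, substituted residue, variants carrying it)
TABLE_2 = [
    (100, "Gly", "Ser", ["ebh14930", "ebh43333", "ebh53550", "ebh43491"]),
    (101, "Lys", "Arg", ["ebh14930"]),
    (130, "Ile", "Leu", ["ebh14930"]),
    (192, "Ser", "Gln", ["ebh14930"]),
    (224, "Ser", "Thr", ["ebh14930", "ebh43333", "ebh53550", "ebh43491"]),
    (285, "Ile", "Val", ["ebh14930", "ebh43333", "ebh53550", "ebh43491"]),
    (69, "Ser", "Asn", ["ebh43333"]),
    (256, "Val", "Ala", ["ebh43333", "ebh53550", "ebh43491"]),
    (93, "Ala", "Ser", ["ebh53550"]),
    (326, "Asp", "Glu", ["ebh53550", "ebh43491"]),
    (333, "Thr", "Ala", ["ebh53550", "ebh43491"]),
    (133, "Leu", "Met", ["ebh43491"]),
    (398, "His", "Arg", ["ebh39444"]),
]


def substitutions_by_variant(rows):
    """Group substitutions by variant, sorted by residue position."""
    grouped = {}
    for position, wild_type, new, variants in rows:
        label = f"{THREE_TO_ONE[wild_type]}{position}{THREE_TO_ONE[new]}"
        for variant in variants:
            grouped.setdefault(variant, []).append((position, label))
    return {
        variant: [label for _, label in sorted(items)]
        for variant, items in grouped.items()
    }


for variant, labels in sorted(substitutions_by_variant(TABLE_2).items()):
    print(variant, ", ".join(labels))
```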



Mutations were also introduced into the coding region of the ebh gene to
identify mutants with improved yield, and/or rate of bioconversion and/or altered
substrate specificity. Exemplary mutant nucleic acid sequences of the present
invention are depicted in SEQ ID NO:30, 32, 34, 60, 62, 64, 66, 68, 70, 72 and 74.

The nucleic acid sequence of SEQ ID NO:30 encodes a mutant ebh25-1 which
exhibits altered substrate specificity. Plasmid pANT849ebh25-1 containing this
mutant gene was deposited and accepted by an International Depository Authority
under the provisions of the Budapest Treaty. The deposit was made on November 21,
2002 to the American Type Culture Collection at 10801 University Boulevard in
Manassas, Virginia 20110-2209. The ATCC Accession Number is PTA-4809. All
restrictions upon public access to this plasmid will be irrevocably removed upon
granting of this patent application. The Deposit will be maintained in a public
depository for a period of thirty years after the date of deposit or five years after the
last request for a sample or for the enforceable life of the patent, whichever is longer.
The above-referenced plasmid was viable at the time of the deposit. The deposit will
be replaced if viable samples cannot be dispensed by the depository.

This S. lividans transformant identified in the screening of mutation 25
(primers NPB29-mut25f (SEQ ID NO:58) and NPB29-mut25r (SEQ ID NO:59)) was
found to produce a product with a different HPLC elution time than epothilone B or
epothilone F. A sample of this unknown was analyzed by LC-MS and was found to
have a molecular weight of 523 (M.W.), consistent with a single hydroxylation of
epothilone B. Plasmid DNA was isolated from the S. lividans culture and used as a
template for PCR amplification using primers NPB29-6f (SEQ ID NO:28) and
NPB29-7r (SEQ ID NO:29) (see Example 17). The expected fragment was obtained
and sequenced using the Big-Dye sequencing system. The ebh25-1 mutant was found
to have two mutations resulting in changes in the amino acid sequence of the protein:
asparagine 195 is changed to serine and serine 294 is changed to proline. The position
targeted for mutation at codon 238 was found to have a two nucleotide change, which
did not result in a change of the amino acid sequence of the protein. The amino acid
sequence of the mutant polypeptide encoded by SEQ ID NO:30 is depicted in SEQ ID
NO:31.
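
The statement that a molecular weight of 523 is consistent with a single hydroxylation of epothilone B can be checked with a short mass calculation. The sketch assumes the molecular formula C27H41NO6S for epothilone B, which is not given in the text, and uses monoisotopic atomic masses. A hydroxylation replaces a C-H with C-OH, so the net change is one oxygen atom. The names below are hypothetical.

```python
# Monoisotopic atomic masses in daltons
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207117,
}

# Assumption: epothilone B is C27H41NO6S (formula not stated in the text)
EPOTHILONE_B = {"C": 27, "H": 41, "N": 1, "O": 6, "S": 1}


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of the atoms in a formula dictionary."""
    return sum(MONOISOTOPIC_MASS[element] * count
               for element, count in formula.items())


def hydroxylate(formula):
    """Return the formula after one hydroxylation (net gain of one oxygen)."""
    product = dict(formula)
    product["O"] += 1
    return product


parent = monoisotopic_mass(EPOTHILONE_B)
hydroxylated = monoisotopic_mass(hydroxylate(EPOTHILONE_B))
print(round(parent, 2), round(hydroxylated, 2), round(hydroxylated - parent, 2))
```

Under this assumption the hydroxylated product has a nominal mass of 523. The calculation cannot by itself distinguish the new product from epothilone F, which is why the different HPLC elution time matters.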

The nucleic acid sequence of SEQ ID NO:32 encodes a mutant ebh10-53,
which exhibits improved bioconversion yield. This S. lividans transformant identified
in the screening of mutation 10 (primers NPB29-mut10f (SEQ ID NO:54) and
NPB29-mut10r (SEQ ID NO:55)) produced a greater yield of epothilone F. Plasmid
DNA was isolated from the S. lividans culture and used as a template for PCR
amplification using primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID
NO:29) (see Example 16). The expected fragment was obtained and sequenced using
the Big-Dye sequencing system. The ebh10-53 mutant was found to have two
mutations resulting in changes in the amino acid sequence of the protein: glutamic
acid 231 is changed to arginine and phenylalanine 190 is changed to tyrosine.
Position 231 was the target of the mutagenesis; the change at residue 190 is an
inadvertent change that is an artifact of the mutagenesis procedure. The amino acid
sequence of the mutant polypeptide encoded by SEQ ID NO:32 is depicted in SEQ ID
NO:33.
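
Amino acid changes such as "glutamic acid 231 is changed to arginine" are commonly written in shorthand (E231R) and can be applied to a sequence programmatically. The sketch below is an illustration only: the five-residue toy sequence is hypothetical because SEQ ID NO:2 is not reproduced here, and the function name apply_substitutions is an assumption. The wild-type check guards against applying a mutation at the wrong position.

```python
import re

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(protein, substitutions):
    """Apply substitutions written like 'E231R' (1-based positions).

    Raises ValueError if a label is malformed, out of range, or names a
    wild-type residue that does not match the sequence.
    """
    residues = list(protein)
    for label in substitutions:
        match = _SUBSTITUTION.match(label)
        if match is None:
            raise ValueError(f"malformed substitution: {label}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range: {label}")
        if residues[position - 1] != wild_type:
            raise ValueError(f"wild-type residue mismatch: {label}")
        residues[position - 1] = new
    return "".join(residues)


# Hypothetical toy sequence with F at position 2 and E at position 4
print(apply_substitutions("MFAEK", ["E4R", "F2Y"]))
```

For the real enzyme the same call would take the full SEQ ID NO:2 sequence and labels such as E231R and F190Y.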

The nucleic acid sequence of SEQ ID NO:34 encodes a mutant ebh24-16,
which also exhibits improved bioconversion yield. This S. lividans transformant,
ebh24-16, identified in the screening of mutation 24 (primers NPB29-mut24f (SEQ ID
NO:56) and NPB29-mut24r (SEQ ID NO:57)), also produced a greater yield of
epothilone F. Plasmid DNA was isolated from the S. lividans culture and used as a
template for PCR amplification using primers NPB29-6f (SEQ ID NO:28) and
NPB29-7r (SEQ ID NO:29). The expected fragment was obtained and sequenced
using the Big-Dye sequencing system. The ebh24-16 mutant was found to have two
mutations resulting in changes in the amino acid sequence of the protein:
phenylalanine 237 is changed to alanine and isoleucine 92 is changed to valine.
Position 237 was the target of the mutagenesis; the change at residue 92 is an
inadvertent change that is an artifact of the mutagenesis procedure. The amino acid
sequence of the mutant polypeptide encoded by SEQ ID NO:34 is depicted in SEQ ID
NO:35.

The nucleic acid sequence of SEQ ID NO:60 encodes a mutant ebh24-16d8,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16d8, identified in the screening of mutation 59 (primer NPB29mut59 (SEQ ID
NO:70)), also produced a greater yield of epothilone F. Plasmid DNA was isolated
from the S. rimosus culture and used as a template for PCR amplification using
primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected
fragment was obtained and sequenced using the Big-Dye sequencing system. The
ebh24-16d8 mutant was found to have one mutation resulting in a change in the
amino acid sequence of the protein: arginine 67 is changed to glutamine. This change
is an artifact of the mutagenesis procedure. The amino acid sequence of the mutant
polypeptide encoded by SEQ ID NO:60 is depicted in SEQ ID NO:61.

The nucleic acid sequence of SEQ ID NO:62 encodes a mutant ebh24-16c11,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16c11, identified in the screening of mutation 59 (primer NPB29mut59 (SEQ
ID NO:70)), also produced a greater yield of epothilone F. Plasmid DNA was isolated
from the S. rimosus culture and used as a template for PCR amplification using
primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected
fragment was obtained and sequenced using the Big-Dye sequencing system. The
ebh24-16c11 mutant was found to have two additional mutations resulting in changes
in the amino acid sequence of the protein: alanine 93 is changed to glycine and
isoleucine 365 is changed to threonine. Position 93 is the target of the
mutagenesis; the change at 365 is an artifact of the mutagenesis procedure. The
amino acid sequence of the mutant polypeptide encoded by SEQ ID NO:62 is
depicted in SEQ ID NO:63.

The nucleic acid sequence of SEQ ID NO:64 encodes a mutant ebh24-16-16,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16-16, identified in the screening of random mutants of ebh24-16, also
produced a greater yield of epothilone F. Plasmid DNA was isolated from the S.
rimosus culture and used as a template for PCR amplification using primers NPB29-
6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected fragment was
obtained and sequenced using the Big-Dye sequencing system. The ebh24-16-16
mutant was found to have one additional mutation resulting in a change in the amino
acid sequence of the protein: valine 106 is changed to alanine. The amino acid
sequence of the mutant polypeptide encoded by SEQ ID NO:64 is depicted in SEQ ID
NO:65.

The nucleic acid sequence of SEQ ID NO:66 encodes a mutant ebh24-16-74,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16-74, identified in the screening of random mutants of ebh24-16, also
produced a greater yield of epothilone F. Plasmid DNA was isolated from the S.
rimosus culture and used as a template for PCR amplification using primers NPB29-
6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected fragment was
obtained and sequenced using the Big-Dye sequencing system. The ebh24-16-74
mutant was found to have one additional mutation resulting in a change in the amino
acid sequence of the protein: arginine 88 is changed to histidine. The amino acid
sequence of the mutant polypeptide encoded by SEQ ID NO:66 is depicted in SEQ ID
NO:67.

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The nucleic acid sequence of SEQ ID NO:68 encodes a mutant e&ft24-M18, 
which also exhibits improved bioconversion yield. This S. riiiiosus transforaiant, 
ebhM'lS identified in the screening of random mutants of ebh also produced a 
greater yield of epothilone F. Plasmid DNA was isolated from the S, rimosus culture 
5 and used as a template for PGR amplification using primers NPB29-6f (SEQ ID 
NO:28) and NPB29-7r (SEQ E) NO:29). The expected fragment was obtained and 
sequenced using the Big-Dye sequencing systennu The mutant was found to 

have two mutations resulting in changes in the amino acid sequence of the protein, 
glutamic acid 31 is changed to lysine and methionine 176 is changed to valine. The 

10 amino acid sequence of the mutant polypeptide encoded by SEQ ID NO:68 is 
depicted in SEQ ID NO:69. 

The nucleic acid sequence of SEQ ID NO:72 encodes a mutant ebh24-16g8,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16g8, identified in the screening of mutation 50 (primer NPB29mut50 (SEQ ID
NO:71)), also produced a greater yield of epothilone F. Plasmid DNA was isolated
from the S. rimosus culture and used as a template for PCR amplification using
primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected
fragment was obtained and sequenced using the Big-Dye sequencing system. The
ebh24-16g8 mutant was found to have two additional mutations resulting in changes
in the amino acid sequence of the protein: methionine 176 is changed to alanine and
isoleucine 130 is changed to threonine. Position 176 is the target of the
mutagenesis; the change at 130 is an artifact of the mutagenesis procedure. The
amino acid sequence of the mutant polypeptide encoded by SEQ ID NO:72 is
depicted in SEQ ID NO:73.

The nucleic acid sequence of SEQ ID NO:74 encodes a mutant ebh24-16b9,
which also exhibits improved bioconversion yield. This S. rimosus transformant,
ebh24-16b9, identified in the screening of mutation 50 (primer NPB29mut50 (SEQ ID
NO:71)), also produced a greater yield of epothilone F. Plasmid DNA was isolated
from the S. rimosus culture and used as a template for PCR amplification using
primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29). The expected
fragment was obtained and sequenced using the Big-Dye sequencing system. The
ebh24-16b9 mutant was found to have two additional mutations resulting in changes
in the amino acid sequence of the protein: methionine 176 is changed to serine and
alanine 140 is changed to threonine. Position 176 is the target of the mutagenesis;
the change at 140 is an artifact of the mutagenesis procedure. The amino acid
sequence of the mutant polypeptide encoded by SEQ ID NO:74 is depicted in SEQ ID
NO:75.
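
The substitutions reported for these mutants can be written in the usual shorthand of wild-type residue, position, and new residue (for example, A93G for alanine 93 changed to glycine). The following is a minimal sketch of how such a list could be applied to a parent protein sequence. The function name `apply_substitutions` and the ten-residue toy sequence are hypothetical and are used only for illustration; the sequences of the sequence listing are not reproduced here.

```python
import re

# Substitutions as reported in the text, in <wild-type><position><new> shorthand.
# For the ebh24-16 derivatives these are the changes described as "additional".
REPORTED = {
    "ebh24-16c11": ["A93G", "I365T"],
    "ebh24-16-16": ["V106A"],
    "ebh24-16-74": ["R88H"],
    "ebh-M18": ["E31K", "M176V"],
    "ebh24-16g8": ["M176A", "I130T"],
    "ebh24-16b9": ["M176S", "A140T"],
}


def apply_substitutions(sequence, substitutions):
    """Return a copy of `sequence` with each point substitution applied.

    Positions are 1-based. A ValueError is raised if the residue found at a
    position does not match the wild-type residue named in the shorthand.
    """
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != old:
            raise ValueError(f"expected {old} at {pos}, found {residues[pos - 1]}")
        residues[pos - 1] = new
    return "".join(residues)


# Toy parent sequence (hypothetical, not SEQ ID NO:2).
toy_parent = "MKTAYIAKQR"
toy_mutant = apply_substitutions(toy_parent, ["A4G", "I6T"])
```

Checking the wild-type residue before substituting guards against applying a list of changes to the wrong parent sequence or with an off-by-one position.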

A mixture composed of the plasmids pANT849ebh-24-16, pANT849ebh-10-53,
pANT849ebh-24-16d8, pANT849ebh-24-16c11, pANT849ebh-24-16-16,
pANT849ebh-24-16-74, pANT849ebh-24-16b9, pANT849ebh-M18 and
pANT849ebh-24-16g8 for these nine mutant genes was deposited and accepted by an
International Depository Authority under the provisions of the Budapest Treaty. The
deposit was made on November 21, 2002 to the American Type Culture Collection at
10801 University Boulevard in Manassas, Virginia 20110-2209. The ATCC Accession
Number is PTA-4808. All restrictions upon public access to this mixture of plasmids
will be irrevocably removed upon granting of this patent application. The deposit will
be maintained in a public depository for a period of thirty years after the date of
deposit or five years after the last request for a sample or for the enforceable life of
the patent, whichever is longer. The above-referenced mixture of plasmids was viable
at the time of the deposit. The deposit will be replaced if viable samples cannot be
dispensed by the depository.

Thus, in accordance with another aspect of the present invention, there are
provided isolated polypeptides of epothilone B hydroxylase and variants and mutants
thereof and isolated polypeptides of ferredoxin or variants thereof. In one
embodiment of the present invention, by "polypeptide" it is meant to include the
amino acid sequence of SEQ ID NO:2, and fragments or variants, which retain
essentially the same biological activity and/or function as this epothilone B
hydroxylase. In another embodiment of the present invention, by "polypeptide" it is
meant to include the amino acid sequence of SEQ ID NO:4, and fragments and/or
variants, which retain essentially the same biological activity and/or function as this
ferredoxin.

By "variants" as used herein it is meant to include polypeptides with amino
acid sequences with conservative amino acid substitutions as compared to SEQ ID
NO:2 or 4 which are demonstrated to exhibit similar biological activity and/or
function to SEQ ID NO:2 or 4. By "conservative amino acid substitutions" it is
meant to include replacement, one for another, of the aliphatic amino acids such as
Ala, Val, Leu and Ile, the hydroxyl residues Ser and Thr, the acidic residues Asp and
Glu, and the amide residues Asn and Gln. Exemplary variant amino acid sequences
of the present invention are depicted in SEQ ID NO:43-49 and the amino acid
substitutions of these exemplary variants are described in Table 2, supra.
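
The definition of a conservative substitution given above amounts to membership of both residues in the same named group. A minimal sketch of that test follows; the group labels used as dictionary keys and the function name `is_conservative` are assumptions made for illustration, while the group memberships are those listed in the text.

```python
# Residue groups exactly as enumerated in the definition above
# (three-letter amino acid codes).
CONSERVATIVE_GROUPS = {
    "aliphatic": {"Ala", "Val", "Leu", "Ile"},
    "hydroxyl": {"Ser", "Thr"},
    "acidic": {"Asp", "Glu"},
    "amide": {"Asn", "Gln"},
}


def is_conservative(original, replacement):
    """Return True if replacing `original` with `replacement` stays within
    one of the groups listed above. Identical residues are not a substitution."""
    if original == replacement:
        return False
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS.values()
    )
```

Under this reading, the valine 106 to alanine change is conservative, whereas the arginine 88 to histidine change is not, because neither Arg nor His belongs to any of the listed groups.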

By "mutants" as used herein it is meant to include polypeptides encoded by
nucleic acid sequences with one or more point mutations, or deletions or additions of
nucleic acids as compared to SEQ ID NO:1 or 3, but which still have similar activity
to the polypeptides encoded by SEQ ID NO:1 or 3. In a preferred embodiment,
mutations are made to the nucleic acid that alter the substrate specificity and/or yield
from the polypeptide encoded thereby. A preferred region of mutation with respect to
the epothilone B hydroxylase gene is that region of the nucleic acid sequence coding
for the approximately 113 amino acid residues comprising the active site of the
enzyme. Also preferred are mutants with at least one amino acid substitution at
amino acid position GLU31, ARG67, ARG88, ILE92, ALA93, VAL106, ILE130,
ALA140, MET176, PHE190, GLU231, SER294, PHE237, or ILE365 of SEQ ID
NO:1. Exemplary mutants ebh25-1, ebh10-53, ebh24-16, ebh24-16d8, ebh24-16c11,
ebh24-16-16, ebh24-16-74, ebh24-16g8, ebh24-16b9 and the nucleic acid sequences
encoding such mutants of the present invention are depicted in SEQ ID NO:31, 33,
35, 61, 63, 65, 67, 69, 71, 73 and 75, and SEQ ID NO:30, 32, 34, 60, 62, 64, 66, 68,
70, 72 and 74, respectively.

A 3-dimensional model of epothilone B hydroxylase has also been constructed
in accordance with general teachings of Greer et al. (Comparative modeling of
homologous proteins. Methods in Enzymology 202:239-52, 1991), Lesk et al.
(Homology Modeling: Inferences from Tables of Aligned Sequences. Curr. Op.
Struc. Biol. (2) 242-247, 1992), and Cardozo et al. (Homology modeling by the ICM
method. Proteins 23, 403-14, 1995) on the basis of the known structure of a
homologous protein EryF (PDB Code IKDST chain A). Homology between these
sequences is 34%. Alignment of the sequences of epothilone B hydroxylase (SEQ ID
NO:2) and EryF (PDB Code IKM chain A; SEQ ID NO:76) is depicted in Figure 3.
A homology model of epothilone B hydroxylase based upon sequence alignment with
EryF is depicted in Figure 4.

An energy plot of the epothilone B hydroxylase model relative to EryF (PDB
code ISM) was also prepared and is depicted in Figure 5. An averaging window size
of 51 residues was used: at a given residue position, the plotted value is the average of
the energies of the 51 residues in the sequence that lie with the given residue at the
central position. As shown in Figure 5, all energies along the sequence lie below zero,
thus indicating that the modeled structure as set forth in Figure 4 and Appendix 1 is
reasonable.
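
The window averaging described for the energy plot can be sketched as follows. This is an illustration only, not the program used to prepare Figure 5: the function name `windowed_average` is hypothetical, and the treatment of positions near the ends of the sequence (where fewer than 51 residues are available, so the window is simply truncated) is an assumption, since the text does not state how the ends were handled.

```python
def windowed_average(energies, window=51):
    """Average per-residue energies over a window centered on each residue.

    `energies` is a list of per-residue energy values in sequence order.
    Near the termini the window is truncated to the residues that exist
    (an assumption; the source does not describe end handling).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    averaged = []
    for i in range(len(energies)):
        start = max(0, i - half)
        stop = min(len(energies), i + half + 1)
        segment = energies[start:stop]
        averaged.append(sum(segment) / len(segment))
    return averaged


def all_below_zero(energies, window=51):
    """Return True if every window-averaged energy lies below zero."""
    return all(value < 0 for value in windowed_average(energies, window))
```

With a 51-residue window, each value reflects the residue itself plus the 25 residues on either side, which smooths out isolated unfavorable positions and highlights longer stretches that may be misfolded in the model.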

The three-dimensional structure represented in the homology model of
epothilone B hydroxylase of Figure 4 is defined by a set of structure coordinates as set
forth in Appendix 1. The term "structure coordinates" refers to Cartesian coordinates
generated from the building of a homology model. As will be understood by those of
skill in the art, however, a set of structure coordinates for a protein is a relative set of
points that define a shape in three dimensions. Thus, it is possible that an entirely
different set of coordinates could define a similar or identical shape. Moreover, slight
variations in the individual coordinates, as emanate from generation of similar
homology models using different alignment templates and/or using different methods
in generating the homology model, will have minor effects on the overall shape.

Variations in coordinates may also be generated because of mathematical
manipulations of the structure coordinates. For example, the structure coordinates set
forth in Appendix 1 could be manipulated by fractionalization of the structure
coordinates, integer additions or subtractions to sets of the structure coordinates,
inversion of the structure coordinates or any combination of the above.
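
The point that such manipulations change the numbers but not the shape can be checked by comparing interatomic distances before and after the manipulation. The sketch below is a simple illustration under stated assumptions: the helper names are hypothetical, the four-point coordinate set is made up, and comparing pairwise distances is a stand-in for a full superimposition. Note that a distance comparison cannot distinguish a structure from its mirror image, which is what inversion through the origin produces.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Distances between every pair of points, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def translate(coords, shift):
    """Add the same offset to every point (e.g. an integer addition)."""
    return [tuple(c + s for c, s in zip(point, shift)) for point in coords]


def invert(coords):
    """Invert every point through the origin."""
    return [tuple(-c for c in point) for point in coords]


def same_internal_geometry(coords_a, coords_b, tol=1e-9):
    """True if the two coordinate sets have matching pairwise distances."""
    da, db = pairwise_distances(coords_a), pairwise_distances(coords_b)
    return len(da) == len(db) and all(abs(x - y) <= tol for x, y in zip(da, db))


# Made-up coordinates standing in for a few atoms of a model.
toy = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 2.0, 0.0), (0.0, 2.0, 3.0)]
shifted = translate(toy, (10, -4, 7))
inverted = invert(toy)
```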

Various computational analyses are therefore necessary to determine whether
a molecule or a portion thereof is sufficiently similar to all or parts of epothilone B
hydroxylase described above as to be considered the same. Such analyses may be
carried out in current software applications, such as SYBYL version 6.7 or
INSIGHTII (Molecular Simulations Inc., San Diego, CA) version 2000 and as
described in the accompanying User's Guides.

For example, the superimposition tool in the program SYBYL allows
comparisons to be made between different structures and different conformations of
the same structure. The procedure used in SYBYL to compare structures is divided
into four steps: 1) load the structures to be compared; 2) define the atom equivalencies
in these structures; 3) perform a fitting operation; and 4) analyze the results. Each
structure is identified by a name. One structure is identified as the target (i.e., the
fixed structure); the second structure (i.e., the moving structure) is identified as the
source structure. Since atom equivalency within SYBYL is defined by user input, for the
purpose of this aspect of the present invention equivalent atoms are defined as protein
backbone atoms (N, Cα, C and O) for all conserved residues between the two
structures being compared. Further, only rigid fitting operations are considered.
When a rigid fitting method is used, the moving structure is translated and rotated to
obtain an optimum fit with the target structure. The fitting operation uses an algorithm
that computes the optimum translation and rotation to be applied to the moving
structure, such that the root mean square difference of the fit over the specified pairs
of equivalent atoms is an absolute minimum. This number, given in angstroms, is
reported by SYBYL.
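
The quantity that the fitting operation minimizes can be sketched as follows. This is a minimal illustration and not SYBYL's implementation: it assumes the two structures have already been superimposed (the search for the optimum rotation and translation is not reproduced), and the dictionary layout and the names `paired_backbone` and `rmsd` are hypothetical.

```python
import math

# Backbone atoms treated as equivalent, using the PDB-style name CA for C-alpha.
BACKBONE = ("N", "CA", "C", "O")


def paired_backbone(target, moving, conserved):
    """Collect coordinate pairs for backbone atoms of conserved residues.

    `target` and `moving` map (residue_number, atom_name) -> (x, y, z).
    `conserved` lists (target_residue, moving_residue) equivalences taken
    from a sequence alignment. Atoms missing from either structure are skipped.
    """
    pairs = []
    for target_res, moving_res in conserved:
        for atom in BACKBONE:
            t_key, m_key = (target_res, atom), (moving_res, atom)
            if t_key in target and m_key in moving:
                pairs.append((target[t_key], moving[m_key]))
    return pairs


def rmsd(pairs):
    """Root mean square difference over pairs of equivalent atoms,
    in the same units as the coordinates (angstroms for PDB-style data)."""
    if not pairs:
        raise ValueError("no equivalent atom pairs were supplied")
    total = sum(math.dist(a, b) ** 2 for a, b in pairs)
    return math.sqrt(total / len(pairs))
```

A rigid fitting routine would call a function like `rmsd` repeatedly, or solve for the rotation in closed form, and report the smallest value found.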

For the purposes of the present invention, any homology model of epothilone
B hydroxylase that has a root mean square deviation of conserved residue backbone
atoms (N, Cα, C, O) of less than about 4.0 Å when superimposed on the
corresponding backbone atoms described by structure coordinates listed in Appendix
1 is considered identical. More preferably, the root mean square deviation is less
than about 3.0 Å. More preferably the root mean square deviation is less than about
2.0 Å.

For the purpose of this invention, any homology model of epothilone B
hydroxylase that has a root mean square deviation of conserved residue backbone
atoms (N, Cα, C, O) of less than about 2.0 Å when superimposed on the
corresponding backbone atoms described by structure coordinates listed in Appendix
1 is considered identical. More preferably, the root mean square deviation is less
than about 1.0 Å.

In another embodiment of the present invention, structural models wherein
backbone atoms have been substituted with other elements which when superimposed
on the corresponding backbone atoms have low root mean square deviations are
considered to be identical. For example, a homology model where the original
backbone carbon, and/or nitrogen and/or oxygen atoms are replaced with other
elements having a root mean square deviation of about 4.0 Å, more preferably about
3.0 Å, even more preferably less than about 2 Å, when superimposed on the
corresponding backbone atoms described by structure coordinates listed in Appendix
1 is considered identical.

The term "root mean square deviation" means the square root of the arithmetic
mean of the squares of the deviations from the mean. It is a way to express the
deviation or variation from a trend or object. For purposes of this invention, the "root
mean square deviation" defines the variation in the backbone of a protein from the
relevant portion of the backbone of the epothilone B hydroxylase portion of the
complex as defined by the structure coordinates described herein.
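
When applied to two superimposed structures, this definition is commonly written as a sum over equivalent atoms. In the expression below, N is the number of equivalent backbone atoms and the two position vectors are those of the i-th equivalent atom in the compared structure and in the reference coordinates after superimposition; these symbols are introduced here for illustration and do not appear in the original text.

```latex
\mathrm{RMSD} = \sqrt{\frac{1}{N} \sum_{i=1}^{N}
  \left\lVert \mathbf{r}_i^{\mathrm{model}} - \mathbf{r}_i^{\mathrm{ref}} \right\rVert^{2}}
```

Because each term is a squared distance, a few badly placed atoms raise the value more than many small displacements do.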

The present invention as embodied by the homology model enables the
structure-based design of additional mutants of epothilone B hydroxylase. For
example, using the homology model of the present invention, residues lying within
10 Å of the binding site of epothilone B hydroxylase have now been defined. These
residues include LEU39, GLN43, ALA45, MET57, LEU58, HIS62, PHE63, SER64,
SER65, ASP66, ARG67, GLN68, SER69, LEU74, MET75, VAL76, ALA77,
ARG78, GLN79, ILE80, ASP84, LYS85, PRO86, PHE87, ARG88, PRO89, SER90,
LEU91, ILE92, ALA93, MET94, ASP95, HIS99, ARG103, PHE110, ILE155,
PHE169, GLN170, CYS172, SER173, SER174, ARG175, MET176, LEU177,
SER178, ARG179, ARG186, PHE190, LEU193, VAL233, GLY234, LEU235,
ALA236, PHE237, LEU238, LEU239, LEU240, ILE241, ALA242, GLY243,
HIS244, GLU245, THR246, THR247, ALA248, ASN249, MET250, LEU283,
THR287, ILE288, ALA289, GLU290, THR291, ALA292, THR293, SER294,
ARG295, PHE296, ALA297, THR298, GLU312, GLY313, VAL314, VAL315,
GLY316, VAL344, ALA345, PHE346, GLY347, PHE348, VAL350, HIS351,
GLN352, CYS353, LEU354, GLY355, GLN356, LEU358, ALA359, GLU362,
LYS389, ASP391, SER392, THR393, ILE394 and TYR395 as set forth in Appendix
1. Mutants with mutations at one or more of these positions are expected to exhibit
altered biological function and/or specificity and thus comprise another embodiment
of preferred mutants of the present invention. Another embodiment of preferred
mutants are molecules that have a root mean square deviation from the backbone
atoms of said epothilone B hydroxylase of not more than about 4.0 Å.
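
A distance-based selection of this kind can be sketched as follows. The sketch is an assumption about how such a list might be derived, not a description of how the list above was produced: it treats the binding site as a set of reference points (for example, atoms of a bound ligand or of the heme), and selects any residue having at least one atom within the cutoff. The function name and the toy coordinates are hypothetical; the coordinates of Appendix 1 are not reproduced.

```python
import math


def residues_within(atoms, site_points, cutoff=10.0):
    """Return residue labels having at least one atom within `cutoff`
    angstroms of any binding-site reference point.

    `atoms` is an iterable of (residue_label, (x, y, z)) tuples;
    `site_points` is a list of (x, y, z) tuples. Labels are returned in
    the order in which they are first encountered, without duplicates.
    """
    selected = []
    seen = set()
    for label, xyz in atoms:
        if label in seen:
            continue
        if any(math.dist(xyz, point) <= cutoff for point in site_points):
            seen.add(label)
            selected.append(label)
    return selected


# Made-up coordinates for illustration; FAR1 is a hypothetical distant residue.
toy_atoms = [
    ("LEU39", (1.0, 0.0, 0.0)),
    ("LEU39", (2.0, 0.0, 0.0)),
    ("GLN43", (9.0, 0.0, 0.0)),
    ("FAR1", (30.0, 0.0, 0.0)),
]
toy_site = [(0.0, 0.0, 0.0)]
near = residues_within(toy_atoms, toy_site)
```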

The structure coordinates of an epothilone B hydroxylase homology model or
portions thereof are stored in a machine-readable storage medium. Such data may be
used for a variety of purposes, such as drug discovery.

Accordingly, another aspect of the present invention relates to a machine-
readable data storage medium comprising a data storage material encoded with the
structure coordinates set forth in Appendix 1.

The three-dimensional model structure of epothilone B hydroxylase can also
be used to identify modulators of biological function and potential substrates of the
enzyme. Various methods or combinations thereof can be used to identify such
modulators.

For example, a test compound can be modeled that fits spatially into a binding
site in epothilone B hydroxylase, according to Appendix 1. Structure coordinates of
amino acids within 10 Å of the binding region of epothilone B hydroxylase defined by
amino acids LEU39, GLN43, ALA45, MET57, LEU58, HIS62, PHE63, SER64,
SER65, ASP66, ARG67, GLN68, SER69, LEU74, MET75, VAL76, ALA77,
ARG78, GLN79, ILE80, ASP84, LYS85, PRO86, PHE87, ARG88, PRO89, SER90,
LEU91, ILE92, ALA93, MET94, ASP95, HIS99, ARG103, PHE110, ILE155,
PHE169, GLN170, CYS172, SER173, SER174, ARG175, MET176, LEU177,
SER178, ARG179, ARG186, PHE190, LEU193, VAL233, GLY234, LEU235,
ALA236, PHE237, LEU238, LEU239, LEU240, ILE241, ALA242, GLY243,
HIS244, GLU245, THR246, THR247, ALA248, ASN249, MET250, LEU283,
THR287, ILE288, ALA289, GLU290, THR291, ALA292, THR293, SER294,
ARG295, PHE296, ALA297, THR298, GLU312, GLY313, VAL314, VAL315,
GLY316, VAL344, ALA345, PHE346, GLY347, PHE348, VAL350, HIS351,
GLN352, CYS353, LEU354, GLY355, GLN356, LEU358, ALA359, GLU362,
LYS389, ASP391, SER392, THR393, ILE394 and TYR395, and the coordinated
heme group, HEM1, can also be used to identify desirable structural and chemical
features of such modulators. Identified structural or chemical features can then be
employed to design or select compounds as potential epothilone B hydroxylase
ligands. By structural and chemical features it is meant to include, but is not limited
to, covalent bonding, van der Waals interactions, hydrogen bonding interactions,
charge interaction, hydrophobic bonding interaction, and dipole interaction.
Compounds identified as potential epothilone B hydroxylase ligands can then be
synthesized and screened in an assay characterized by binding of a test compound to
epothilone B hydroxylase, or in characterizing the ability of epothilone B hydroxylase
to modulate a protease target in the presence of a small molecule. Examples of assays
useful in screening of potential epothilone B hydroxylase ligands include, but are not
limited to, screening in silico, in vitro assays and high throughput assays.

As will be understood by those of skill in the art upon this disclosure, other
structure-based design methods can be used. Various computational structure-based
design methods have been disclosed in the art. For example, a number of computer
modeling systems are available in which the sequence of epothilone B hydroxylase
and the epothilone B hydroxylase structure (i.e., atomic coordinates of epothilone B
hydroxylase as provided in Appendix 1 and/or the atomic coordinates within 10 Å of
the binding region as provided above) can be input. This computer system then
generates the structural details of one or more of these regions in which a potential
epothilone B hydroxylase modulator binds so that complementary structural details of
the potential modulators can be determined. Design in these modeling systems is
generally based upon the compound being capable of physically and structurally
associating with epothilone B hydroxylase. In addition, the compound must be able
to assume a conformation that allows it to associate with epothilone B hydroxylase.
Some modeling systems estimate the potential inhibitory or binding effect of a
potential epothilone B hydroxylase substrate or modulator prior to actual synthesis
and testing.

Methods for screening chemical entities or fragments for their ability to
associate with a given protein target are also well known. Often these methods begin
by visual inspection of the binding site on the computer screen. Selected fragments or
chemical entities are then positioned in a binding region of epothilone B hydroxylase.
Docking is accomplished using software such as INSIGHTII, QUANTA and SYBYL,
followed by energy minimization and molecular dynamics with standard molecular
mechanics force fields such as MMFF, CHARMM and AMBER. Examples of
computer programs which assist in the selection of chemical fragments or chemical
entities useful in the present invention include, but are not limited to, GRID
(Goodford, 1985), AUTODOCK (Goodsell, 1990), and DOCK (Kuntz et al. 1982).

Upon selection of preferred chemical entities or fragments, their relationship
to each other and epothilone B hydroxylase can be visualized and then assembled into
a single potential modulator. Programs useful in assembling the individual chemical
entities include, but are not limited to, CAVEAT (Bartlett et al. 1989) and 3D
Database systems (Martin 1992).

Alternatively, compounds may be designed de novo using either an empty
active site or optionally including some portion of a known inhibitor. Methods of this
type of design include, but are not limited to, LUDI (Bohm 1992) and LeapFrog
(Tripos Inc., St. Louis, MO).

Programs such as DOCK (Kuntz et al. 1982) can be used with the atomic
coordinates from the homology model to identify potential ligands from databases or
virtual databases which potentially bind in the active site binding region and which
may therefore be suitable candidates for synthesis and testing.

Also provided in the present invention are vectors comprising polynucleotides
of the present invention and host cells which are genetically engineered with vectors
of the present invention to produce epothilone B hydroxylase or active fragments and
variants or mutants of this enzyme and/or ferredoxin or active fragments thereof.
Generally, any vector suitable to maintain, propagate or express polynucleotides to
produce these polypeptides in the host cell may be used for expression in this regard.
In accordance with this aspect of the invention the vector may be, for example, a
plasmid vector, a single- or double-stranded phage vector, or a single- or double-
stranded RNA or DNA viral vector. Vectors may be extra-chromosomal or designed
for integration into the host chromosome. Such vectors include, but are not limited to,
chromosomal, episomal and virus-derived vectors, e.g., vectors derived from bacterial
plasmids, bacteriophages, yeast episomes, yeast chromosomal elements, and viruses
such as baculoviruses, papova viruses, SV40, vaccinia viruses, adenoviruses, fowl
pox viruses, pseudorabies viruses and retroviruses, and vectors derived from
combinations thereof, such as those derived from plasmid and bacteriophage genetic
elements, cosmids and phagemids.

Useful expression vectors for prokaryotic hosts include, but are not limited to,
bacterial plasmids, such as those from E. coli, Bacillus or Streptomyces, including
pBluescript, pGEX-2T, pUC vectors, pET vectors, ColE1, pCR1, pBR322, pMB9,
pCW, pBMS200, pBMS2020, pIJ101, pIJ702, pANT849, pOJ260, pOJ446,
pSET152, pKC1139, pKC1218, pFD666 and their derivatives, wider host range
plasmids, such as RP4, phage DNAs, the numerous derivatives of phage lambda,
e.g., NM989, λGT10 and λGT11, and other phages, e.g., M13 and filamentous single
stranded phage DNA.

Vectors of the present invention for use in yeast will typically contain an
origin of replication suitable for use in yeast and a selectable marker that is functional
in yeast. Examples of yeast vectors useful in the present invention include, but are not
limited to, Yeast Integrating plasmids (e.g., YIp5) and Yeast Replicating plasmids
(the YRp and YEp series plasmids), Yeast Centromere plasmids (the YCp series
plasmids), Yeast Artificial Chromosomes (YACs) which are based on yeast linear
plasmids, denoted YLp, pGPD-2, 2μ plasmids and derivatives thereof, and improved
shuttle vectors such as those described in Gietz et al., Gene, 74: 527-34 (1988)
(YIplac, YEplac and YCplac).

Mammalian vectors useful for recombinant expression may include a viral
origin, such as the SV40 origin (for replication in cell lines expressing the large
T-antigen, such as COS1 and COS7 cells), the papillomavirus origin, or the EBV
origin for long term episomal replication (for use, e.g., in 293-EBNA cells, which
constitutively express the EBV EBNA-1 gene product and adenovirus E1A).
Expression in mammalian cells can be achieved using a variety of plasmids,
including, but not limited to, pSV2, pBC12BI, and p91023, pCDNA vectors as well as
lytic virus vectors (e.g., vaccinia virus, adenovirus, and baculovirus), episomal virus
vectors (e.g., bovine papillomavirus), and retroviral vectors (e.g., murine
retroviruses). Useful vectors for insect cells include baculoviral vectors and pVL941.

Selection of an appropriate promoter to direct mRNA transcription and
construction of expression vectors are well known. In general, however, expression
constructs will contain sites for transcription initiation and termination, and, in the
transcribed region, a ribosome binding site for translation. The coding portion of the
mature transcripts expressed by the constructs will include a translation initiating
codon at the beginning and a termination codon appropriately positioned at the end of
the polypeptide to be translated.
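
The requirement that a coding portion begin with an initiating codon and end with a termination codon can be checked mechanically before a construct is assembled. The sketch below is illustrative only: the function name `check_coding_region` is hypothetical, the standard stop codons TAA, TAG and TGA are assumed, and ATG is assumed as the only initiating codon unless others (for example GTG, which is a common alternative start codon in bacteria) are passed in.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def check_coding_region(dna, start_codons=("ATG",)):
    """Return a list of problems found in a DNA coding region.

    An empty list means the sequence has a whole number of codons, starts
    with an accepted initiating codon, ends with a termination codon, and
    contains no termination codon before the end.
    """
    seq = dna.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of three")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons or codons[0] not in start_codons:
        problems.append("missing initiating codon at the beginning")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("missing termination codon at the end")
    if any(codon in STOP_CODONS for codon in codons[1:-1]):
        problems.append("premature termination codon")
    return problems
```

For example, `check_coding_region("ATGGCTTAA")` reports no problems, while a sequence beginning with GTG is accepted only when `start_codons=("ATG", "GTG")` is supplied.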

Examples of useful promoters for prokaryotes include, but are not limited to,
phage promoters such as phage lambda pL promoter, the trc promoter, a hybrid
derived from the trp and lac promoters, the bacteriophage T7 promoter, the TAC or
TRC system, the major operator and promoter regions of phage lambda, the control
regions of fd coat protein, snpA promoter, melC promoter, ermE* promoter or the
araBAD operon. Examples of useful promoters for yeast include, but are not limited
to, the CYC1 promoter, the GAL1 promoter, the GAL10 promoter, ADH1 promoter,
the promoters of the yeast α-mating system, and the GPD promoter. Examples of
promoters routinely used in mammalian expression vectors include, but are not
limited to, the CMV immediate early promoter, the HSV thymidine kinase promoter,
the early and late SV40 promoters, the promoters of retroviral LTRs, such as those of
the Rous Sarcoma Virus (RSV), and metallothionein promoters, such as the mouse
metallothionein-I promoter.

Vectors comprising the polynucleotides can be introduced into host cells using
any number of well known techniques including infection, transduction, transfection,
transvection and transformation. The polynucleotides may be introduced into a host
alone or with additional polynucleotides encoding, for example, a selectable marker
or ferredoxin reductase. In a preferred embodiment of the present invention the
polynucleotides for epothilone B hydroxylase and ferredoxin are introduced into the
host cell. Host cells for the various expression constructs are well known, and those
of skill can routinely select a host cell for expressing the epothilone B hydroxylase
and/or ferredoxin in accordance with this aspect of the present invention. Examples
of mammalian expression systems useful in the present invention include, but are not
limited to, the C127, 3T3, CHO, HeLa, human kidney 293 and BHK cell lines, and
the COS-7 line of monkey kidney fibroblasts.

Alternatively, as exemplified herein, epothilone B hydroxylase and ferredoxin
can be expressed recombinantly in microorganisms.

Accordingly, another aspect of the present invention relates to recombinantly
produced microorganisms which express epothilone B hydroxylase alone or in
conjunction with the ferredoxin and which are capable of hydroxylating a compound,
and in particular an epothilone, having a terminal alkyl group to produce a compound
having a terminal hydroxyalkyl group. The recombinantly produced microorganisms are
produced by transforming cells such as bacterial cells with a plasmid comprising a
nucleic acid sequence encoding epothilone B hydroxylase. In a preferred
embodiment, the cells are transformed with a plasmid comprising a nucleic acid
encoding epothilone B hydroxylase or mutants or variants thereof as well as the
nucleic acid sequence encoding ferredoxin located downstream of the epothilone B
hydroxylase gene. Examples of microorganisms which can be transformed with these
plasmids to produce the recombinant microorganisms of the present invention
include, but are not limited to, Escherichia coli, Bacillus megaterium, Amycolatopsis
orientalis, Sorangium cellulosum, Rhodococcus erythropolis, and Streptomyces
species such as Streptomyces lividans, Streptomyces virginiae, Streptomyces
venezuelae, Streptomyces albus, Streptomyces coelicolor, Streptomyces rimosus and
Streptomyces griseus.

The recombinantly produced microorganisms of the present invention are
useful in microbial processes or methods for production of compounds, and in
particular epothilones, containing a terminal hydroxyalkyl group. In general, the
hydroxyalkyl-bearing product can be produced by culturing the recombinantly
produced microorganism or enzyme derived therefrom, capable of selectively
hydroxylating a terminal carbon or alkyl, in the presence of a suitable substrate in an
aqueous nutrient medium containing sources of assimilable carbon and nitrogen,
under submerged aerobic conditions.

Suitable epothilones employed as substrate for the method of the present
invention may be any such compound having a terminal carbon or terminal alkyl
group capable of undergoing the enzymatic hydroxylation of the present invention.
The starting material, or substrate, can be isolated from natural sources, such as
Sorangium cellulosum, or it can be a synthetically formed epothilone. Other
substrates having a terminal carbon or terminal alkyl group capable of undergoing an
enzymatic hydroxylation can be employed by the methods herein. For example,
compactin can be used as a substrate, which upon hydroxylation forms the compound
pravastatin. Methods for hydroxylating compactin to pravastatin via an
Actinomadura strain are set forth in U.S. Patent 5,942,423 and U.S. Patent 6,274,360.

For example, using the recombinant microorganisms of the present invention
at least one epothilone can be prepared as described in WO 00/39276, U.S. Serial
No. 09/468,854, filed December 21, 1999, the text of which is incorporated herein as
if set forth at length. An epothilone of the following Formula I

HO-CH2-(A1)n-(Q)m-(A2)o-E (I)

where

A1 and A2 are independently selected from the group of optionally substituted
C1-C3 alkyl and alkenyl;

Q is an optionally substituted ring system containing one to three rings and at
least one carbon to carbon double bond in at least one ring;

n, m, and o are integers selected from the group consisting of zero and 1,
where at least one of m or n or o is 1; and

E is an epothilone core; can be prepared.
This method comprises the steps of contacting at least one epothilone of the following
formula II

CH3-(A1)n-(Q)m-(A2)o-E (II)

where A1, Q, A2, E, n, m, and o are defined as above;
with a recombinantly produced microorganism, or an enzyme derived
therefrom, which is capable of selectively catalyzing the hydroxylation of formula II,
and effecting said hydroxylation.

In a preferred embodiment, the starting material is epothilone B. Epothilone B
can be obtained from the fermentation of Sorangium cellulosum So ce90, as described
in DE 41 38 042 and WO 93/10121. The strain has been deposited at the Deutsche
Sammlung von Mikroorganismen (German Collection of Microorganisms) (DSM)
under No. 6773. The process of fermentation is also described in Hofle, G., et al.,
Angew. Chem. Int. Ed. Engl., Vol. 35, No. 13/14, 1567-1569 (1996). Epothilone B can
also be obtained by chemical means, such as those disclosed by Meng, D., et al., J.
Am. Chem. Soc., Vol. 119, No. 42, 10073-10092 (1996); Nicolaou, K., et al., J. Am.
Chem. Soc., Vol. 119, No. 34, 7974-7991 (1997) and Schinzer, D., et al., Chem. Eur.
J., Vol. 5, No. 9, 2483-2491 (1999).

Growth of the recombinantly produced microorganism selected for use in the 
process may be achieved by one of ordinary skill in the art by the use of appropriate 
nutrient medium. Appropriate media for the growing of the recombinantly produced 
microorganisms include those that provide nutrients necessary for the growth of 
microbial cells. See, for example, T. Nagodawithana and J. M. Wasileski, Chapter 2: 
"Media Design for Industrial Fermentations," Nutritional Requirements of 
Commercially Important Microorganisms, edited by T. W. Nagodawithana and G. 
Reed, Esteekay Associates, Inc., Milwaukee, WI, 18-45 (1998); T. L. Miller and B. 
W. Churchill, Chapter 10: "Substrates for Large-Scale Fermentations," Manual of 
Industrial Microbiology and Biotechnology, edited by A.L. Demain and N. A. 
Solomon, American Society for Microbiology, Washington, D.C., 122-136 (1986). A 
typical medium for growth includes necessary carbon sources, nitrogen sources, and 
trace elements. Inducers may also be added to the medium. The term inducer, as used 
herein, includes any compound enhancing formation of the desired enzymatic activity 
within the recombinantly produced microbial cell. Typical inducers as used herein 
may include solvents used to dissolve substrates, such as dimethyl sulfoxide, dimethyl 
formamide, dioxane, ethanol and acetone. Further, some substrates, such as 
epothilone B, may also be considered to be inducers. 

Carbon sources may include sugars such as glucose, fructose, galactose, 
maltose, sucrose, mannitol, sorbitol, glycerol, starch and the like; organic acids such as 
sodium acetate, sodium citrate, and the like; and alcohols such as ethanol, propanol 
and the like. Preferred carbon sources include, but are not limited to, glucose, 
fructose, sucrose, glycerol and starch. 

Nitrogen sources may include N-Z amine A, corn steep liquor, soybean 
meal, beef extract, yeast extract, tryptone, peptone, cottonseed meal, peanut meal, 
amino acids such as sodium glutamate and the like, sodium nitrate, ammonium sulfate 
and the like. 

Trace elements may include magnesium, manganese, calcium, cobalt, nickel, 
iron, sodium and potassium salts. Phosphates may also be added in trace or, 
preferably, greater than trace amounts. 

The medium employed for the fermentation may include more than one 
carbon or nitrogen source or other nutrient. 

For growth of the recombinantly produced microorganisms and/or 
hydroxylation according to the method of the present invention, the pH of the medium 
is preferably from about 5 to about 8 and the temperature is from about 14°C to about 
37°C, preferably 28°C. The duration of the reaction is 1 to 100 
hours, preferably 8 to 72 hours. 

The medium is incubated for a period of time necessary to complete the 
biotransformation as monitored by high performance liquid chromatography (HPLC). 
Typically, the period of time needed to complete the transformation is twelve to one 
hundred hours and preferably about 72 hours after the addition of the substrate. The 
medium is placed on a rotary shaker (New Brunswick Scientific Innova 5000) 
operating at 150 to 300 rpm and preferably about 250 rpm with a throw of 2 inches. 

The hydroxyalkyl-bearing product can be recovered from the fermentation 
broth by conventional means that are commonly used for the recovery of other known 
biologically active substances. Examples of such recovery means include, but are not 
limited to, isolation and purification by extraction with a conventional solvent, such as 
ethyl acetate and the like; by pH adjustment; by treatment with a conventional resin, 
for example, by treatment with an anion or cation exchange resin or a non-ionic 
adsorption resin; by treatment with a conventional adsorbent; by 
distillation; by crystallization; or by recrystallization, and the like. 

The extract obtained above from the biotransformation reaction mixture can be 
further isolated and purified by column chromatography and analytical thin layer 
chromatography. 

The ability of a recombinantly produced microorganism of the present 
invention to biotransform an epothilone having a terminal alkyl group to an 
epothilone having a terminal hydroxyalkyl group was demonstrated. In these 
experiments, a culture comprising a Streptomyces lividans clone containing a plasmid 
with the ebh gene, as described in more detail in Example 11, was incubated with an 
epothilone B suspension for 3 days at 30°C with agitation. A sample of the incubate 
was extracted with an equal volume of 25% methanol:75% n-butanol, vortexed and 
allowed to settle for 5 minutes. Two hundred µl of the organic phase was transferred 
to an HPLC vial and analyzed by HPLC/MS (Example 12). A product peak of 
epothilone F eluted at a retention time of 15.9 minutes and had a protonated molecular 
weight of 524. The epothilone B substrate eluted at 19.0 minutes and had a 
protonated molecular weight of 508. The peak retention times and molecular weights 
were confirmed using known standards. 

Rates of biotransformation of epothilone B by cells expressing ebh were also 
compared to rates of biotransformation by ebh mutants. Cells expressing ebh 
comprised a frozen spore preparation of S. lividans (pANT849-ebh). Cells 
expressing mutants comprised frozen spore preparations of S. lividans 
(pANT849-ebh10-53) and S. lividans (pANT849-ebh24-16). A frozen spore preparation of S. 
lividans TK24 was used as the control. The cells were pre-incubated for several days 
at 30°C. Following this pre-incubation, epothilone B in 100% EtOH was added to 
each culture to a final concentration of 0.05% weight/volume. Samples were then 
taken at 0, 24, 48 and 72 hours with the exception of the S. lividans 
(pANT849-ebh24-16) culture, in which the epothilone B had been completely converted to 
epothilone F at 48 hours. The samples were analyzed by HPLC. The results are 
calculated as a percentage of the epothilone B at time 0 hours. 

Epothilone B: 

Time (hours)   TK24   pANT849-ebh   pANT849-ebh10-53   pANT849-ebh24-16 
0              100%   100%          100%               100% 
24             99%    78%           69%                56% 
48             87%    19%           39%                0% 
72             87%    0%            3%                 — 

Epothilone F: 

Time (hours)   TK24   pANT849-ebh   pANT849-ebh10-53   pANT849-ebh24-16 
0              0%     0%            0%                 0% 
24             0%     4%            9%                 23% 
48             0%     21%           29%                52% 
72             0%     14%           41%                — 
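
The two tables can be read together as a rough mass balance. The following is a minimal sketch, not part of the original experiments: it transcribes the reported percentages and, for each sampled time point, compares the epothilone B consumed with the epothilone F formed. The function name `mass_balance` and the dictionary layout are illustrative assumptions, and the second data column of the epothilone F table is assumed to be pANT849-ebh, by analogy with the epothilone B table.

```python
# Values are percentages of the time-zero epothilone B signal, transcribed
# from the tables. None marks the time point that was not sampled.
HOURS = [0, 24, 48, 72]

EPOTHILONE_B = {
    "TK24": [100, 99, 87, 87],
    "pANT849-ebh": [100, 78, 19, 0],
    "pANT849-ebh10-53": [100, 69, 39, 3],
    "pANT849-ebh24-16": [100, 56, 0, None],
}

# Assumption: column order matches the epothilone B table.
EPOTHILONE_F = {
    "TK24": [0, 0, 0, 0],
    "pANT849-ebh": [0, 4, 21, 14],
    "pANT849-ebh10-53": [0, 9, 29, 41],
    "pANT849-ebh24-16": [0, 23, 52, None],
}


def mass_balance(strain):
    """Return (hour, B consumed, F formed, consumed B not seen as F) per sampled point."""
    rows = []
    for hour, b_left, f_formed in zip(HOURS, EPOTHILONE_B[strain], EPOTHILONE_F[strain]):
        if b_left is None or f_formed is None:
            continue  # time point not sampled
        consumed = 100 - b_left
        rows.append((hour, consumed, f_formed, consumed - f_formed))
    return rows


if __name__ == "__main__":
    for strain in EPOTHILONE_B:
        for hour, consumed, formed, gap in mass_balance(strain):
            print(f"{strain:18s} {hour:2d} h  B consumed {consumed:3d}%  "
                  f"F formed {formed:3d}%  difference {gap:3d}%")
```

The last column is simply the arithmetic difference between substrate consumed and product detected; the source does not state what accounts for it.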





The ability of cells expressing ebh to biotransform compactin to pravastatin 
was also examined. In these experiments, frozen spore preparations of S. lividans 
(pANT849) or S. lividans (pANT849-ebh) were grown for several days at 30°C. 
Following the pre-incubation, an aliquot of each cell culture was transferred to a 
polypropylene culture tube, compactin was added to each culture tube, and the tubes 
were incubated for 24 hours, 30°C, 250 rpm. An aliquot of the culture broth was then 
extracted and compactin and pravastatin values relative to the control S. lividans 
(pANT849) culture were measured via HPLC. 



Compactin and pravastatin as a percentage of starting compactin 
concentration: 

              S. lividans (pANT849)   S. lividans (pANT849-ebh) 
Compactin     36%                     11% 
Pravastatin   11%                     53% 



As discussed supra, mutant ebh25-1 (SEQ ID NO:30) exhibits altered 
substrate specificity and biotransformation of epothilone B by this mutant resulted in 
a product with a different HPLC elution time than epothilone B or epothilone F. A 
sample of this unknown was analyzed by LC-MS and was found to have a molecular 
weight of 523 (M.W.), consistent with a single hydroxylation of epothilone B. The 
structure of the biotransformation product was determined as 24-hydroxyl-epothilone 
B, based on MS and NMR data (compared with data of epothilone B): 

24-hydroxyl-epothilone B 
Formula A 

Molecular Formula: C27H41NO7S 
Molecular Weight: 523 

Mass Spectrum: ES+ (m/z): 524 ([M+H]+), 506. 

LC/MS/MS: +ESI (m/z): 524, 506, 476, 436, 320 

HRMS: Calculated for [M+H]+: 524.2682; Found: 524.2701 

HPLC (Rt) 7.3 minutes (on the analytical HPLC system) 

LC/NMR Observed Chemical Shifts 

Varian AS-600 (Proton: 599.624 MHz), 
Solvent D2O/CD3CN (δ 1.94): ^6 
Proton: δ 7.30 (s, 1H), 6.43 (s, 1H), 5.30 (m, 1H), 4.35 (m, 1H), 
3.81 (m, 1H), 3.74 (m, 1H), 3.68 (m, 1H), 3.43 (m, 1H), 2.87 
(m, 1H), 2.66 (s, 3H), 2.40 (m, 2H), 1.58 (b, 1H), 1.48 (b, 1H), 
1.35 (m, 3H), 1.18 (s, 3H), 1.13 (s, 3H), 0.87 (m, 6H) 
*Peaks between 1.8-2.1 ppm were not observed due to solvent 
suppression. 
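
The HRMS figures quoted above can be sanity-checked from the molecular formula. The sketch below is an illustration only: it sums standard monoisotopic atomic masses for the protonated species C27H42NO7S and computes the deviation of the found value in parts per million. The helper names are hypothetical, and the electron mass is deliberately not subtracted, which is an assumption under which the sum matches the quoted calculated value to four decimal places.

```python
# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}


def monoisotopic_mass(composition):
    """Sum atomic masses for a composition given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def ppm_error(found, calculated):
    """Relative deviation of a measured m/z from the calculated m/z, in ppm."""
    return (found - calculated) / calculated * 1e6


# [M+H]+ of C27H41NO7S: one extra hydrogen; electron mass not subtracted (assumption).
PROTONATED = {"C": 27, "H": 42, "N": 1, "O": 7, "S": 1}

calculated_mz = monoisotopic_mass(PROTONATED)
deviation_ppm = ppm_error(found=524.2701, calculated=524.2682)

if __name__ == "__main__":
    print(f"calculated [M+H]+ = {calculated_mz:.4f}")
    print(f"deviation of found value = {deviation_ppm:.1f} ppm")
```

The same bookkeeping explains the nominal masses reported earlier: epothilone B gives a protonated ion at 508 and the singly hydroxylated products at 524, a difference of 16 corresponding to one oxygen atom.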

The proton chemical shift was assigned as follows: 

Position   Proton   Pattern 
1          — 
2          2.40     m 
3          4.35     m 
4 
5 
6          3.43     m 
7          3.68     m 
8          1.58     m 
9          1.35     b 
10         1.48     b 
10         1.35     b 
11         SSP 
12 
13         2.87     m 
14         SSP 
15         5.30     m 
16 
17         6.43     s 
18 
19         7.30     s 
20 
21         2.66     s 
22         1.18     s 
23         0.87     m 
24         3.81     m 
24         3.74     m 
25         0.87     m 
26         1.13     s 
27         SSP 

*SSP: not observed due to solvent suppression. 

Accordingly, the compositions and methods of the present invention are useful 
in producing known compounds that are microtubule-stabilizing agents as well as new 
compounds comprising epothilone analogs such as 24-hydroxyl-epothilone B 
(Formula A) and pharmaceutically acceptable salts thereof expected to be useful as 
microtubule-stabilizing agents. The microtubule stabilizing agents produced using 
these compositions and methods are useful in the treatment of a variety of cancers and 
other proliferative diseases including, but not limited to, the following: 

- carcinoma, including that of the bladder, breast, colon, kidney, liver, lung, 
ovary, pancreas, stomach, cervix, thyroid and skin; including squamous cell 
carcinoma; 

- hematopoietic tumors of lymphoid lineage, including leukemia, acute 
lymphocytic leukemia, acute lymphoblastic leukemia, B-cell lymphoma, T-cell 
lymphoma, Hodgkins lymphoma, non-Hodgkins lymphoma, hairy cell lymphoma and 
Burketts lymphoma; 

- hematopoietic tumors of myeloid lineage, including acute and chronic 
myelogenous leukemias and promyelocytic leukemia; 

- tumors of mesenchymal origin, including fibrosarcoma and 
rhabdomyosarcoma; 

- other tumors, including melanoma, seminoma, teratocarcinoma, 
neuroblastoma and glioma; 






- tumors of the central and peripheral nervous system, including astrocytoma, 
neuroblastoma, glioma, and schwannomas; 

- tumors of mesenchymal origin, including fibrosarcoma, rhabdomyosarcoma, 
and osteosarcoma; and 

- other tumors, including melanoma, xeroderma pigmentosum, 
keratoacanthoma, seminoma, thyroid follicular cancer and teratocarcinoma. 

Microtubule stabilizing agents produced using the compositions and methods 
of the present invention will also inhibit angiogenesis, thereby affecting the growth of 
tumors and providing treatment of tumors and tumor-related disorders. Such 
anti-angiogenesis properties of these compounds will also be useful in the treatment of 
other conditions responsive to anti-angiogenesis agents including, but not limited to, 
certain forms of blindness related to retinal vascularization, arthritis, especially 
inflammatory arthritis, multiple sclerosis, restenosis and psoriasis. 

Microtubule stabilizing agents produced using the compositions and methods 
of the present invention will induce or inhibit apoptosis, a physiological cell death 
process critical for normal development and homeostasis. Alterations of apoptotic 
pathways contribute to the pathogenesis of a variety of human diseases. Compounds 
of the present invention such as those set forth in formula I and II and Formula A, as 
modulators of apoptosis, will be useful in the treatment of a variety of human diseases 
with aberrations in apoptosis including, but not limited to, cancer and precancerous 
lesions, immune response related diseases, viral infections, degenerative diseases of 
the musculoskeletal system and kidney disease. 

Without wishing to be bound to any mechanism or morphology, microtubule 
stabilizing agents produced using the compositions and methods of the present 
invention may also be used to treat conditions other than cancer or other proliferative 
diseases. Such conditions include, but are not limited to, viral infections such as 
herpesvirus, poxvirus, Epstein-Barr virus, Sindbis virus and adenovirus; autoimmune 
diseases such as systemic lupus erythematosus, immune mediated glomerulonephritis, 
rheumatoid arthritis, psoriasis, inflammatory bowel diseases and autoimmune diabetes 
mellitus; neurodegenerative disorders such as Alzheimer's disease, AIDS-related 
dementia, Parkinson's disease, amyotrophic lateral sclerosis, retinitis pigmentosa, 
spinal muscular atrophy and cerebellar degeneration; AIDS; myelodysplastic 
syndromes; aplastic anemia; ischemic injury associated myocardial infarctions; stroke 
and reperfusion injury; restenosis; arrhythmia; atherosclerosis; toxin-induced or 
alcohol induced liver diseases; hematological diseases such as chronic anemia and 
aplastic anemia; degenerative diseases of the musculoskeletal system such as 
osteoporosis and arthritis; aspirin-sensitive rhinosinusitis; cystic fibrosis; multiple 
sclerosis; kidney diseases; and cancer pain. 

The following nonlimiting examples are provided to further illustrate the 
present invention. 

EXAMPLES 

Example 1: Reagents 
R2 Medium was prepared as follows: 
A solution containing sucrose (103 grams), K2SO4 (0.25 grams), MgCl2·6H2O 
(10.12 grams), glucose (10 grams), Difco Casaminoacids (0.1 grams) and distilled 
water (800 ml) was prepared. Eighty ml of this solution was then poured into a 200 
ml screw capped bottle containing 2.2 grams Difco Bacto agar. The bottle was 
capped and autoclaved. At time of use, the medium was remelted and the following 
autoclaved solutions were added in the order listed: 
1 ml KH2PO4 (0.5%) 
8 ml CaCl2·2H2O (3.68%) 
1.5 ml L-proline (20%) 
10 ml TES buffer (5.73%, adjusted to pH 7.2) 
0.2 ml Trace element solution containing ZnCl2 (40 mg), FeCl3·6H2O (200 mg), 
CuCl2·2H2O (10 mg), MnCl2·4H2O (10 mg), Na2B4O7·10H2O (10 mg), and 
(NH4)6Mo7O24·H2O 
0.5 ml NaOH (1 N) (sterilization not required) 
0.5 ml Required growth factors for auxotrophs (Histidine (50 µg/ml); Cysteine 
(37 µg/ml); adenine, guanine, thymidine and uracil (7.5 µg/ml); and Vitamins (0.5 
µg/ml)). 

R2YE medium was prepared in the same fashion as R2 medium. However, 5 ml of 
Difco yeast extract (10%) was added to each 100 ml flask at time of use. 

P (protoplast) buffer was prepared as follows: 

A basal solution made up of the following was prepared: 
Sucrose (103 grams) 
K2SO4 (0.25 grams) 
MgCl2·6H2O (2.02 grams) 
Trace Element Solution as described for R2 medium (2 ml) 
Distilled water to 800 ml 

Eighty ml aliquots of the basal solution were then dispensed and autoclaved. Before 
use, the following was added to each flask in the order listed: 
1 ml KH2PO4 (0.5%) 
10 ml CaCl2·2H2O (3.68%) 
TES buffer (5.75%, adjusted to pH 7.2) 

T (transformation) buffer was prepared by mixing the following sterile solutions: 
25 ml Sucrose (10.3%) 
75 ml distilled water 
1 ml Trace Element Solution as described for R2 medium 
1 ml K2SO4 (2.5%) 

The following are then added to 9.3 ml of this solution: 
0.2 ml CaCl2 (5 M) 
0.5 ml Tris maleic acid buffer prepared from 1 M solution of Tris adjusted to 
pH 8.0 by adding maleic acid. 
For use, 3 parts by volume of the above solution are added to 1 part by weight of PEG 
1000, previously sterilized by autoclaving. 

L (lysis) buffer was prepared by mixing the following sterile solutions: 
100 ml Sucrose (10.3%) 
10 ml TES buffer (5.73%, adjusted to pH 7.2) 
1 ml K2SO4 (2.5%) 
1 ml Trace Element Solution as described for R2 medium 
1 ml KH2PO4 (0.5%) 
0.1 ml MgCl2·6H2O (2.5 M) 
1 ml CaCl2 (0.25 M) 

CRM Medium 

A solution containing the following components was prepared in 1 liter of 
dH2O: glucose (10 grams), sucrose (103 grams), MgCl2·6H2O (10.12 grams), BBL™ 
trypticase soy broth (15 grams) (Becton Dickinson Microbiology Systems, Sparks, 
Maryland, USA), and BBL™ yeast extract (5 grams) (Becton Dickinson 
Microbiology Systems). The solution was autoclaved for 30 minutes. Thiostrepton 
was added to a concentration of 10 µg/ml for cultures propagated with plasmids. 

Electroporation Buffer 

A solution containing 30% (wt/vol) PEG 1000, 10% glycerol, and 6.5% 
sucrose was prepared in dH2O. The solution was sterilized by vacuum filtration 
through a 0.22 µm cellulose acetate filter. 

Example 2: Extraction of Chromosomal DNA from Strain SC15847 

Genomic DNA was isolated from an Amycolatopsis orientalis soil isolate, 
strain designation SC15847 (ATCC PT-1043), using a guanidine-detergent lysis 
method, DNAzol reagent (Invitrogen, Carlsbad, California, USA). The SC15847 
culture was grown 24 hours at 28°C in F7 medium (glucose 2.2%, yeast extract 1.0%, 
malt extract 1.0%, peptone 0.1%, pH 7.0). Twenty ml of culture was harvested by 
centrifugation and resuspended in 20 ml of DNAzol, mixed by pipetting and 
centrifuged 10 minutes in the Beckman TJ6 centrifuge. Ten ml of 100% ethanol was 
added, and the tube was inverted several times and stored at room temperature 3 minutes. The DNA 
was spooled on a glass pipette, washed in 100% ethanol and allowed to air dry 10 
minutes. The pellet was resuspended in 500 µl of 8 mM NaOH and once dissolved it 
was neutralized with 30 µl of 1 M HEPES pH 7.2. 

Example 3: PCR Reactions 

PCR reactions were prepared in a volume of 50 µl, containing 200-500 ng of 
genomic DNA or 1.0 µl of the cDNA, a forward primer and a reverse primer, the forward 
primer being either P450-1+ (SEQ ID NO:23) or P450-1a+ (SEQ ID NO:24) or 
P450-2+ (SEQ ID NO:25) and the reverse primer being either P450-3- (SEQ ID NO:27) or 
P450-2- (SEQ ID NO:26). All primers were added to a final concentration of 1.4-2.0 µM. The PCR 
reaction was prepared with 1 µl of Taq enzyme (2.5 units) (Stratagene), 5 µl of Taq 
buffer and 4 µl of 2.5 mM dNTPs with dH2O to 50 µl. The cycling reactions were 
performed on a Geneamp® PCR system with the following protocol: 95°C for 5 
minutes, 5 cycles [95°C 30 seconds, 37°C 15 seconds (30% ramp), 72°C 30 seconds], 
35 cycles (94°C 30 seconds, 65°C 15 seconds, 72°C 30 seconds), 72°C 7 minutes. 
The expected sizes for the reactions are 340 bp for the P450-1+ (SEQ ID NO:23) or 
P450-1a+ (SEQ ID NO:24) and P450-3- (SEQ ID NO:27) primer pairs, 240 bp for the 
P450-1+ (SEQ ID NO:23) and P450-2- (SEQ ID NO:26) primer pairs and 130 bp for 
the P450-2+ (SEQ ID NO:25) and P450-3- (SEQ ID NO:27) primer pairs. 
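
The reaction set-up and cycling protocol above lend themselves to a quick arithmetic check. The sketch below is illustrative only: it applies C1·V1 = C2·V2 to the dNTP addition and sums the programmed hold times of the cycling protocol. The function names and stage labels are hypothetical, ramp times between temperatures are ignored, and the 2.5 mM dNTP stock is taken as stated without assuming whether it refers to each nucleotide or to the total.

```python
def final_concentration(stock_conc, stock_volume_ul, reaction_volume_ul):
    """Dilution formula C1*V1 = C2*V2 solved for C2 (same units as stock_conc)."""
    return stock_conc * stock_volume_ul / reaction_volume_ul


# Each stage: (label, [(temperature in C, hold time in seconds), ...], repeats).
# Labels are descriptive only; temperatures and times are those given in the text.
PROTOCOL = [
    ("initial denaturation", [(95, 300)], 1),
    ("first 5 cycles", [(95, 30), (37, 15), (72, 30)], 5),
    ("next 35 cycles", [(94, 30), (65, 15), (72, 30)], 35),
    ("final extension", [(72, 420)], 1),
]


def total_hold_seconds(protocol):
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    return sum(
        repeats * sum(seconds for _temperature, seconds in steps)
        for _label, steps, repeats in protocol
    )


# 4 ul of 2.5 mM dNTPs in a 50 ul reaction.
dntp_final_mM = final_concentration(2.5, 4.0, 50.0)
hold_minutes = total_hold_seconds(PROTOCOL) / 60

if __name__ == "__main__":
    print(f"final dNTP concentration: {dntp_final_mM:.2f} mM")
    print(f"programmed hold time: {hold_minutes:.0f} minutes")
```

Because the 37°C annealing step uses a reduced ramp rate, the actual run time on the instrument would be longer than the summed hold time.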

Example 4: Cloning of Epothilone B Hydroxylase and Ferredoxin Genes 

Twenty µg of SC15847 genomic DNA was digested with BglII restriction 
enzyme for 6 hours at 37°C. A 30K Nanosep column (Gelman Sciences, Ann Arbor, 
Michigan, USA) was used to concentrate the DNA and remove the enzyme and 
buffer. The reactions were concentrated to 40 µl and washed with 200 µl of TE. The 
digestion products were then separated on a 0.7% agarose gel and genomic DNA in the 
range of 12-15 kb was excised from the gel and purified using the Qiagen gel 
extraction method. The genomic DNA was then ligated to plasmid pWB19N (U.S. 
Patent 5,516,679), which had been digested with BamHI and dephosphorylated using 
the SAP I enzyme (Roche Molecular Biochemicals, Indianapolis, Indiana, catalog #1 
758 250). Ligation reactions were performed in a 15 µl volume with 1 U of T4 DNA 
ligase (Invitrogen) for 1 hour at room temperature. One µl of the ligation was 
transformed to 100 µl of chemically competent DH10B cells (Invitrogen) and 100 µl 
plated to five LB agar plates with 30 µg/ml of neomycin and incubated at 37°C overnight. 

Five nylon membrane circles (Roche Molecular Biochemicals, Indianapolis, 
Indiana) were numbered and marked for orientation. The membranes were placed on 
the plates 2 minutes and then allowed to dry for 5 minutes. The membranes were then 
placed on Whatman filter disks saturated with 10% SDS for 5 minutes, 0.5 N NaOH 
with 1.5 M NaCl for 5 minutes, 1.5 M NaCl with 1.0 M Tris pH 8.0 for 5 minutes, 
and 15 minutes on 2X SSC. The filters were hybridized as described previously for 
the Southern hybridization. Hybridizing colonies were picked to 2 ml of TB with 30 
µg/ml neomycin and grown overnight at 37°C. Plasmid DNA was isolated using a 
miniprep column procedure (Mo Bio). This plasmid was named NPB29-1. 

Example 5: DNA Sequencing and Analysis 

The cloned PCR products were sequenced using fluorescent-dye-labeled 
terminator cycle sequencing, Big-Dye sequencing kit (Applied Biosystems, Foster 
City, California, USA) and were analyzed using laser-induced fluorescence capillary 
electrophoresis, ABI Prism 310 sequencer (Applied Biosystems). 

Example 6: Extraction of Total RNA 

Total RNA was isolated from the SC15847 culture using a modification of the 
Chomczynski and Sacchi method with a mono-phasic solution of phenol and 
guanidine isothiocyanate, Trizol reagent (Invitrogen). Five ml of an SC15847 frozen 
stock culture was thawed and used to inoculate 100 ml of F7 media in a 500 ml 
Erlenmeyer flask. The culture was grown in a shaker incubator at 230 rpm, 30°C for 
20 hours to an optical density at 600 nm (OD600) of 9.0. The culture was placed in a 
16°C shaker incubator at 230 rpm for 20 minutes. Fifty-five milligrams of epothilone 
B was dissolved in 1 ml of 100% ethanol and added to the culture. A second ml of 
ethanol was used to rinse the residual epothilone B from the tube and added to the 
culture. The culture was incubated at 16°C, 230 rpm for 30 hours. Thirty ml of the 
culture was transferred to a 50 ml tube, 150 mg of lysozyme was added to the culture 
and the culture was incubated 5 minutes at room temperature. Ten ml of the culture 
was placed in a 50 ml Falcon tube and centrifuged 5 minutes, 4°C in a TJ6 centrifuge. 
Two ml of chloroform was added and the tube was mixed vigorously for 15 seconds. 
The tube was incubated 2 minutes at room temperature and centrifuged 10 minutes, 
top speed in the TJ6 centrifuge. The aqueous layer was transferred to a fresh tube and 
2.5 ml of isopropanol was added to precipitate the RNA. The tube was incubated 10 
minutes at room temperature and centrifuged 10 minutes, 4°C. The supernatant was 
removed, the pellet was rinsed with 70% ethanol and dried briefly under vacuum. The 
pellet was resuspended in 150 µl of RNase-free dH2O. Fifty µl of 7.5 M LiCl was 
added to the RNA and incubated at -20°C for 30 minutes. The RNA was pelleted by 
centrifugation 10 minutes, 4°C in a microcentrifuge. The pellet was rinsed with 200 µl 
of 70% ethanol, dried briefly under vacuum and resuspended in 150 µl of RNase-free 
dH2O. 

The RNA was treated with DNase I (Ambion, Austin, Texas, USA). Twenty-five 
µl of total RNA (5.3 µg/µl), 2.5 µl of DNase I buffer and 1.0 µl of DNase I were combined 
and incubated at 37°C for 25 minutes. Five µl of DNase I inactivation buffer was added, 
the mixture was incubated 2 minutes and centrifuged 1 minute, and the supernatant was 
transferred to a fresh tube. 

Example 7: cDNA Synthesis 

cDNA was synthesized from the total RNA using the Superscript II enzyme 
(Invitrogen). The reaction was prepared with 1 µl of total RNA (5.3 µg/µl), 9 µl of 
dH2O, 1 µl of dNTP mix (10 mM), and 1 µl of random hexamers. The reaction was 
incubated at 65°C for 5 minutes then placed on ice. The following components were 
then added: 4 µl of 1st strand buffer, 1 µl of RNase Inhibitor, 2.0 µl of 0.1 M DTT, 
and 1 µl of Superscript II enzyme. The reaction was incubated at room temperature 10 
minutes, 42°C for 50 minutes and 70°C for 15 minutes. One µl of RNaseH was added 
and incubated 20 minutes at 37°C, 15 minutes at 70°C and stored at 4°C. 

Example 8: DNA Labeling 

The PCR conditions used to amplify the P450 specific products from genomic 
DNA and cDNA were used to amplify the insert of plasmid pCRscript-29. Plasmid 
pCRscript-29 contains a 340 bp PCR fragment amplified from SC15847 genomic 
DNA using primers P450-1+ (SEQ ID NO:23) and P450-3- (SEQ ID NO:27). Two µl 
of the plasmid prep was used as a template, with a total of 25 cycles. The amplified 
product was gel purified using the Qiaquick gel extraction system (Qiagen). The 
extracted DNA was ethanol precipitated and resuspended in 5 µl of TE; the yield was 
estimated to be 500 ng. This fragment was labeled with digoxigenin using the chem 
link labeling reagent (Roche Molecular Biochemicals, Indianapolis, Indiana, catalog 
#1 836 463). Five µl of the PCR product was mixed with 0.5 µl of Dig-chem link and 
dH2O added to 20 µl. The reaction was incubated 30 minutes at 85°C and 5 µl of stop 
solution added. The probe concentration was estimated at 20 ng/µl. 

Example 9: Southern DNA Hybridization 

Ten µl of genomic DNA (0.5 µg/µl) was digested with BamHI, BglII, EcoRI, 
HindIII or NotI and separated at 12 volts for 16 hours. The gel was depurinated 10 
minutes in 0.25 N HCl and transferred by vacuum to a nylon membrane (Roche 
Molecular Biochemicals) in 0.4 N NaOH, 5" Hg, 90 minutes using a vacuum blotter 
(Bio-Rad Laboratories, Inc., Hercules, California, USA, catalog #165-5000). The 
membrane was rinsed in 1 M ammonium acetate and UV-crosslinked using the 
Stratalinker UV Crosslinker (Stratagene). The membrane was rinsed in 2X SSC and 
stored at room temperature. 

The membrane was prehybridized 1 hour at 42°C in 20 ml of Dig Easy Hyb 
buffer (Roche Molecular Biochemicals). The probe was denatured 10 minutes at 65°C 
and then placed on ice. Five ml of probe in Dig-Easy Hyb at an approximate 
concentration of 20 ng/ml was incubated with the membrane at 42°C overnight. The 
membrane was washed 2 times in 2X SSC with 0.1% SDS at room temperature, then 
2 times in 0.5X SSC with 0.1% SDS at 65°C. The membrane was equilibrated in 
Genius buffer 1 (10 mM maleic acid, 15 mM NaCl; pH 7.5; 0.3% v/v Tween 20) 
(Roche Molecular Biochemicals, Indianapolis, Indiana) for 2 minutes, then incubated 
with 2% blocking solution (2% Blocking reagent in Genius Buffer 1) (Roche 
Molecular Biochemicals, Indianapolis, Indiana) for 1 hour at room temperature. The 
membrane was incubated with a 1:20,000 dilution of anti-dig antibody in 50 ml of 
blocking solution for 30 minutes. The membrane was washed 2 times, 15 minutes 
each in 50 ml of Genius buffer 1. The membrane was equilibrated for two minutes in 
Genius Buffer 3 (10 mM Tris-HCl, 10 mM NaCl; pH 9.5). One ml of a 1:100 dilution 
of CSPD (disodium 3-(4-methoxyspiro{1,2-dioxetane-3,2'-(5'-chloro)tricyclo[3.3.1.1(3,7)]decan}-4-yl)phenyl 
phosphate) (Roche Molecular 
Biochemicals) in Genius buffer 3 was added to the membrane and incubated 5 
minutes at room temperature, then placed at 37°C for 15 minutes. The membrane was 
exposed to Biomax ML film (Kodak, Rochester, New York, USA) for 1 hour. 
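
The dilutions quoted in this example translate into small pipetting volumes, sketched below as an illustration. The helper name `stock_volume_ul` is hypothetical, and a 1:N dilution is assumed to mean one part stock in N parts final volume; the source does not define the convention.

```python
def stock_volume_ul(final_volume_ml, dilution_factor):
    """Volume of stock (in microliters) for a 1:dilution_factor dilution.

    Assumes 1:N means one part stock in N parts of final volume.
    """
    return final_volume_ml * 1000.0 / dilution_factor


# Anti-dig antibody: 1:20,000 in 50 ml of blocking solution.
antibody_ul = stock_volume_ul(final_volume_ml=50.0, dilution_factor=20000)

# CSPD substrate: 1 ml of a 1:100 dilution in Genius buffer 3.
cspd_ul = stock_volume_ul(final_volume_ml=1.0, dilution_factor=100)

if __name__ == "__main__":
    print(f"antibody stock: {antibody_ul} ul in 50 ml")
    print(f"CSPD stock: {cspd_ul} ul in 1 ml")
```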

Example 10: E. coli Transformation 

Competent cells were purchased from Invitrogen. E. coli strain DH10B was 
used as a host for genomic cloning. The chemically competent cells were thawed on 
ice and 100 µl aliquoted to a 17 x 100-mm polypropylene tube on ice. One µl of the 
ligation mixture was added to the cells and incubated on ice for 30 minutes. The cells 
were incubated at 42°C for 45 seconds, then placed on ice 1-2 minutes. 0.9 ml of 
SOC medium (Invitrogen) was added and the cells were incubated one hour at 
30-37°C at 200-240 rpm. Cells were plated on a selective medium (Luria agar with 
neomycin or ampicillin at a concentration of 30 µg/ml or 100 µg/ml respectively). 

Example 11: Transformation of Streptomyces lividans TK24 

Plasmid pWB19N849 was constructed by digesting plasmid pWB19N with 
HindIII and treating with SAP I and digesting plasmid pANT849 (Kieser, et al., 2000, 
Practical Streptomyces Genetics, John Innes) with HindIII. The two linearized 
fragments were ligated 1 hour at room temperature with 1 U of T4 DNA ligase. One µl 
of the ligation reaction was used to transform XL-1 Blue electrocompetent cells 
(Stratagene). The recovered cells were plated to LB neomycin (30 µg/ml) overnight at 
37°C. Colonies were picked to 2 ml of LB with 30 µg/ml neomycin and incubated 
overnight at 30°C. MoBio plasmid minipreps were performed on all cultures. 
Plasmids constructed from the ligation of pWB19N and pANT849 were determined 
by electrophoretic mobility on 0.7% agarose. The plasmid pWB19N849 was digested 
with HindIII and BglII to excise a 5.3 kb fragment equivalent to plasmid pANT849 
digested with BglII and HindIII. This 5.3 kb fragment was purified on an agarose gel 
and extracted using the Qiaquick gel extraction system. 

A 1.469 kb DNA fragment containing the epothilone B hydroxylase gene and 
the downstream ferredoxin gene was amplified using PCR. The 50 µl PCR reaction 
was composed of 5 µl of Taq buffer, 2.5 µl glycerol, 1 µl of 20 ng/µl NPB29-1 
plasmid, 0.4 µl of 25 mM dNTPs, 1.0 µl each of primers NPB29-6F (SEQ ID NO:28) 
and NPB29-7R (SEQ ID NO:29) (5 pmole/µl), 38.1 µl of dH2O and 0.5 µl of Taq 
enzyme (Stratagene). The reactions were performed on a Perkin Elmer 9700, 95°C for 
5 minutes, then 30 cycles (96°C for 30 seconds, 60°C 30 seconds, 72°C for 2 
minutes), and 72°C for 7 minutes. The PCR product was purified using a Qiagen 
minielute column with the PCR cleanup procedure. The purified product was digested 
with BglII and HindIII and purified on a 0.7% agarose gel. A 1.469 kb band was 
excised from the gel and eluted using a Qiagen minielute column. Five µl of this PCR 
product was ligated with 2 µl of the BglII, HindIII digested pANT849 vector in a 10 
µl ligation reaction. The reaction was incubated at room temperature for 24 hours and 
then transformed to S. lividans TK24 protoplasts. 

Twenty ml of YEME media was inoculated with a frozen spore suspension of 
S. lividans TK24 and grown 48 hours in a 125 ml bi-indent flask. Protoplasts were 
prepared as described in Practical Streptomyces Genetics. The ligation reaction was 
mixed with protoplasts, then 500 µl of transformation buffer was added, followed 
immediately by 5 ml of P buffer. The transformation reactions were spun down 7 
minutes at 2,750 rpm, resuspended in 100 µl of P buffer and plated to one R2YE 
plate. The plate was incubated at 28°C for 20 hours then overlaid with 5 ml of LB 
0.7% agar with 250 µg/ml thiostrepton. After 7 days colonies were picked to an 
R2YE grid plate with 50 µg/ml of thiostrepton. The colonies were grown an 
additional 5 days at 28°C, then stored at 4°C. 

This recombinant microorganism has been deposited with the ATCC and 
designated PTA-4022. 

Example 12: Transformation of Streptomyces rimosus

The procedure of Pigac and Schrempf, Appl. Environ. Microbiol., Vol. 61, No. 1,
352-356 (1995) was used to transform S. rimosus. S. rimosus strain R6 593 was
cultivated in 20 ml of CRM medium at 30°C on a rotary shaker (250 rpm). The cells
were harvested at 24 hrs by centrifugation for 5 minutes, 5,000 rpm, 4°C, and
resuspended in 20 ml of 10% sucrose, 4°C, and centrifuged for 5 minutes, 5,000 rpm,
4°C. The pellet was resuspended in 10 ml of 15% glycerol, 4°C, and centrifuged for 5
minutes, 5,000 rpm, 4°C. The pellet was resuspended in 2 ml of 15% glycerol, 4°C,
with 100 µg/ml lysozyme and incubated at 37°C for 30 minutes, centrifuged for 5
minutes, 5,000 rpm, 4°C, and resuspended in 2 ml of 15% glycerol, 4°C. The 15%
glycerol wash was repeated once and the pellet was resuspended in 1 to 2 ml of
Electroporation Buffer. The cells were stored at -80°C in 50-200 µl aliquots.

The ligations were prepared as described for the S. lividans transformation.
After the incubation of the ligation reaction, the volume was brought to 100 µl with
dH2O, NaCl was added to 0.3 M, and the reaction extracted with an equal volume of
24:1:1 phenol:chloroform:isoamyl alcohol. Twenty µg of glycogen was added and the
ligated DNA was precipitated with 2 volumes of 100% ethanol at -20°C for 30
minutes. The DNA was pelleted 10 minutes in a microcentrifuge, washed once with
70% ethanol, dried 5 minutes in a speed-vac concentrator and resuspended in 5 µl of
dH2O.

One frozen aliquot of cells was thawed at room temperature and divided, 50
µl per tube, for each DNA sample for electroporation. The cells were stored on ice
until use. DNA in 1 to 2 µl of dH2O was added and mixed. The cell and DNA mixture was
transferred to a 2 mm gapped electrocuvette (Bio-Rad Laboratories, Richmond,
California, USA) that was pre-chilled on ice. The cells were electroporated at a setting
of 2 kV (10 kV/cm), 25 µF, 400 Ω using a Gene Pulser™ (Bio-Rad Laboratories). The
cells were diluted with 0.75 to 1.0 ml of CRM (0-4°C), transferred to 15 ml culture
tubes and incubated with agitation 3 hrs at 30°C. The cells were plated on trypticase
soy broth agar plates with 10-30 µg/ml of thiostrepton.
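The pulse settings quoted above are related by simple arithmetic, sketched below. The function names are hypothetical. The time constant is the nominal RC product of the stated resistance and capacitance, which assumes the parallel resistor dominates the circuit; the source does not report a measured time constant.

```python
def field_strength_kv_per_cm(voltage_kv, gap_mm):
    """Field strength = applied voltage / electrode gap (gap converted to cm)."""
    return voltage_kv / (gap_mm / 10.0)


def nominal_time_constant_ms(resistance_ohm, capacitance_uf):
    """Nominal RC time constant. ohm x µF gives microseconds; divide by 1000 for ms.

    Assumption: the sample resistance is much higher than the parallel
    resistor, so the 400 ohm setting governs the decay.
    """
    return resistance_ohm * capacitance_uf / 1000.0


field = field_strength_kv_per_cm(voltage_kv=2.0, gap_mm=2.0)
tau = nominal_time_constant_ms(resistance_ohm=400.0, capacitance_uf=25.0)

print(f"field strength: {field:.1f} kV/cm")
print(f"nominal time constant: {tau:.1f} ms")
```

With a 2 mm cuvette the 2 kV setting reproduces the 10 kV/cm figure given in the text; a different cuvette gap would change the field strength at the same voltage.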

Example 13: High performance liquid chromatography

The liquid chromatography separation was performed using a Waters 2690
Separation Module system (Waters Corp., Milford, MA, USA) and a column, 4.6 x
150 mm, filled with SymmetryShield RP8, particle size 3.5 µm (Waters Corp.,
Milford, MA, USA). Gradient mobile phase programming was used with a flow
rate of 1.0 ml/minute. Eluent A was water/acetonitrile (20:1) + 10 mM ammonium
acetate. Eluent B was acetonitrile/water (20:1). The mobile phase was a linear
gradient from 12% B to 28% B over 6 minutes and held isocratic at 28% B over 4
minutes. This was followed by a 28% B to 100% B linear gradient over 20 minutes
and a linear gradient to 12% B over two minutes with a 3 minute hold at 12% B.
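The gradient program described above can be written as a list of time/composition breakpoints and interpolated, as in the sketch below. The breakpoint times are obtained by accumulating the stated segment durations (6, 4, 20, 2 and 3 minutes); the names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in minutes, % eluent B) breakpoints derived from the segment durations:
# 12 -> 28% over 6 min, hold 4 min, 28 -> 100% over 20 min,
# 100 -> 12% over 2 min, hold 3 min.
GRADIENT = [
    (0.0, 12.0),
    (6.0, 28.0),
    (10.0, 28.0),
    (30.0, 100.0),
    (32.0, 12.0),
    (35.0, 12.0),
]


def percent_b(t_min, program=GRADIENT):
    """Return % B at time t_min by linear interpolation between breakpoints."""
    if t_min < program[0][0] or t_min > program[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("program breakpoints are not contiguous")


for t in (0, 3, 8, 20, 31, 35):
    print(f"t = {t:>2} min: {percent_b(t):.1f}% B")
```

The total program length under this reading is 35 minutes per injection.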

Example 14: Mass spectrometry 

The column effluent was introduced directly into the electrospray ion source of a
ZMD mass spectrometer (Micromass, Manchester, UK). The instrument was calibrated
using Test Juice reference standard (Waters Corp., Milford, MA, USA), which was
delivered at a flow of 10 µl/minute from a syringe pump (Harvard Apparatus, Holliston,
MA, USA). The mass spectrometer was operated at a low mass resolution of 13.2 and a
high mass resolution of 11.2. Spectra were acquired using a scan range of m/z 100 to
600 at an acquisition rate of 10 spectra/second. The ionization technique employed was
positive electrospray (ES). The sprayer voltage was kept at 2900 V and the cone voltage
of the ion source was kept at a potential of 17 V.

Example 15: Use of the ebh gene sequence (SEQ ID NO:1) to isolate cytochrome
P450 genes from other microorganisms

Genomic DNA was isolated from a set of cultures (ATCC43491,
ATCC14930, ATCC53630, ATCC53550, ATCC39444, ATCC43333, ATCC35165)
using the DNAzol reagent. The DNA was used as a template for PCR reactions using
primers designed to the sequence of the ebh gene. Three sets of primers were used for
amplification: NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID NO:29), NPB29-16f
(SEQ ID NO:50) and NPB29-17r (SEQ ID NO:51), and NPB29-19f (SEQ ID
NO:52) and NPB29-20r (SEQ ID NO:53).

PCR reactions were prepared in a volume of 20 µl, containing 200-500 ng of
genomic DNA and a forward and reverse primer. All primers were added to a final
concentration of 1.4-2.0 µM. The PCR reaction was prepared with 0.2 µl of
Advantage™ 2 Taq enzyme (BD Biosciences Clontech, Palo Alto, California, USA),
2 µl of Advantage™ 2 Taq buffer and 0.2 µl of 2.5 mM dNTPs, with dH2O to 20
µl. The cycling reactions were performed on a Geneamp® 9700 PCR system or a
Mastercycler® gradient (Eppendorf, Westbury, New York, USA) with the following
protocol: 95°C for 5 minutes, 35 cycles (96°C 20 seconds, 54-69°C 30 seconds, 72°C
2 minutes), 72°C for 7 minutes. The expected size of the PCR products is
approximately 1469 bp for the NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID
NO:29) primer pair, 1034 bp for the NPB29-16f (SEQ ID NO:50) and NPB29-17r
(SEQ ID NO:51) primer pair and 1318 bp for the NPB29-19f (SEQ ID NO:52) and
NPB29-20r (SEQ ID NO:53) primer pair. The PCR reactions were analyzed on 0.7%
agarose gels. PCR products of the expected size were excised from the gel and
purified using the Qiagen gel extraction method. The purified products were
sequenced using the Big-Dye sequencing kit and analyzed using an ABI 310
sequencer.
Example 16: Construction of plasmid pPCRscript-ebh

A 1.469 kb DNA fragment containing the epothilone B hydroxylase gene and
the downstream ferredoxin gene was amplified using PCR. The 50 µl PCR reaction
was composed of 5 µl of Taq buffer, 2.5 µl glycerol, 1 µl of 20 ng/µl NPB29-1
plasmid, 0.4 µl of 25 mM dNTPs, 1.0 µl each of primers NPB29-6f (SEQ ID NO:28)
and NPB29-7r (SEQ ID NO:29) (5 pmole/µl), 38.1 µl of dH2O and 0.5 µl of Taq
enzyme (Stratagene). The reactions were performed on a Geneamp® 9700 PCR
system, with the following conditions: 95°C for 5 minutes, then 30 cycles (96°C for
30 seconds, 60°C for 30 seconds, 72°C for 2 minutes), and 72°C for 7 minutes. The PCR
product was purified using a Qiagen Qiaquick column with the PCR cleanup
procedure. The purified product was digested with BglII and HindIII and purified on a
0.7% agarose gel. A 1.469 kb band was excised from the gel and eluted using a
Qiagen Qiaquick gel extraction procedure. The fragments were then cloned into the
pPCRscript Amp vector using the PCRscript Amp cloning kit. Colonies containing
inserts were picked to 1-2 ml of LB (Luria Broth) with 100 µg/ml ampicillin and
grown at 30-37°C for 16-24 hours at 230-300 rpm. Plasmid isolation was performed
using the Mo Bio miniplasmid prep kit. The sequence of the insert was confirmed by
cycle sequencing with the Big-Dye sequencing kit. This plasmid was named
pPCRscript-ebh.

Example 17: Mutagenesis of the ebh gene for improved yield or altered
specificity

The Quikchange® XL Site-Directed Mutagenesis Kit and the Quikchange®
Multi Site-Directed Mutagenesis kit, both from Stratagene, were used to introduce
mutations in the coding region of the ebh gene. Both of these methods employ DNA
primers 35-45 bases in length containing the desired mutation (SEQ ID NO:54-59 and
71), a methylated circular plasmid template and PfuTurbo® DNA Polymerase (U.S.
Patent Nos. 5,545,552, 5,866,395 and 5,948,663) to generate copies of the plasmid
template incorporating the mutation carried on the mutagenic primers. Subsequent
digestion of the reaction with the restriction endonuclease DpnI selectively
digests the methylated plasmid template but leaves the non-methylated mutated
plasmid intact. The manufacturer's instructions were followed for all procedures with
the exception of the DpnI digestion step, in which the incubation time was increased
from 1 hr to 3 hrs. The pPCRscript-ebh vector was used as the template for
mutagenesis.

One to two µl of the reaction was transformed to either XL1-Blue®
electrocompetent or XL10-Gold® ultracompetent cells (Stratagene). Cells were plated
to a density of greater than 100 colonies per plate on LA (Luria Agar) 100 µg/ml
ampicillin plates, and incubated 24-48 hrs at 30-37°C. The entire plate was
resuspended in 5 ml of LB containing 100 µg/ml ampicillin. Plasmid was isolated
directly from the resuspended cells by centrifuging the cells and then purifying the
plasmid using the Mo Bio miniprep procedure. This plasmid was then used as a
template for PCR with primers NPB29-6f (SEQ ID NO:28) and NPB29-7r (SEQ ID
NO:29) to amplify a mutated expression cassette. Digestion of the 1.469 kb PCR
product with the restriction enzymes BglII and HindIII was used to prepare this
fragment for ligation to vector pANT849 also digested with BglII and HindIII.

Alternatively, the resuspended cells were used to inoculate 20-50 ml of LB
containing 100 µg/ml ampicillin and grown 18-24 hrs at 30-37°C. Qiagen midi-preps
were performed on the cultures to isolate plasmid DNA containing the desired
mutation. Digestion with the restriction enzymes BglII and HindIII was used to excise
the mutated expression cassette for ligation to BglII and HindIII digested plasmid
pANT849. Screening of mutants was performed in S. lividans or S. rimosus as
described.

Alternatively, the method of Leung et al., Technique - A Journal of Methods in
Cell and Molecular Biology, Vol. 1, No. 1, 11-15 (1989) was used to generate random
mutation libraries of the ebh gene. Manganese and/or reduced dATP concentration is
used to control the mutagenesis frequency of the Taq polymerase. The plasmid
pPCRscript-ebh was digested with NotI to linearize the plasmid. The polymerase buffer
was prepared with 0.166 M (NH4)2SO4, 0.67 M Tris-HCl pH 8.8, 61 mM MgCl2, 67
µM EDTA pH 8.0 and 1.7 mg/ml bovine serum albumin. The PCR reaction was
prepared with 10 µl of NotI digested pPCRscript-ebh (0.1 ng/µl), 10 µl of polymerase
buffer, 1.0 µl of 1 M β-mercaptoethanol, 10.0 µl of DMSO, 1.0 µl of NPB29-6f (SEQ
ID NO:28) primer (100 pmole/µl), 1.0 µl of NPB29-7r (SEQ ID NO:29) primer (100
pmole/µl), 10 µl of 5 mM MnCl2, 10.0 µl 10 mM dGTP, 10.0 µl 2 mM dATP, 10 mM
dTTP, 10.0 µl 10 mM dCTP, and 2.0 µl Taq polymerase. dH2O was added to 100 µl.
Reactions were also prepared as described above but without MnCl2. The cycling
reactions were performed on a GeneAmp® PCR system with the following protocol:
95°C for 1 minute, 25-30 cycles (94°C for 1 minute, 55°C for 30 seconds, 72°C for 4
minutes), 72°C for 7 minutes. The PCR reactions were separated on an agarose gel
and recovered using a Qiagen spin column. The fragments were then digested with
BglII and HindIII and purified using a Qiagen spin column. The purified fragments
were then ligated to BglII and HindIII digested pANT849 plasmids. Screening of
mutants was performed in S. lividans and S. rimosus.
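The nucleotide bias behind the random mutagenesis can be made explicit by converting each stock addition to its final concentration in the 100 µl reaction. The sketch below uses only the additions for which the text states both a stock concentration and a volume; dTTP is left out because its volume is not given. The names `final_concentration` and `additions_mM_ul` are hypothetical.

```python
def final_concentration(stock_conc, volume_ul, total_ul=100.0):
    """Dilution rule C_final = C_stock x V_added / V_total (same units as stock)."""
    return stock_conc * volume_ul / total_ul


# (stock concentration in mM, volume added in µl), as listed in the text.
additions_mM_ul = {
    "dGTP": (10.0, 10.0),
    "dATP": (2.0, 10.0),
    "dCTP": (10.0, 10.0),
    "MnCl2": (5.0, 10.0),
}

final_mM = {
    name: final_concentration(stock, volume)
    for name, (stock, volume) in additions_mM_ul.items()
}

for name, conc in final_mM.items():
    print(f"{name}: {conc:.2f} mM final")

# Ratio of the limiting nucleotide to a full-strength one.
print(f"dATP : dGTP = {final_mM['dATP'] / final_mM['dGTP']:.1f}")
```

Under these figures dATP is present at one fifth the concentration of dGTP and dCTP, which is the "reduced dATP" condition the method relies on.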

Table of Characterized Mutants

Mutant         Position       Substitution    Wild-type
ebh24-16       92             Valine          Isoleucine
               237            Alanine         Phenylalanine
ebh25-1        195            Serine          Asparagine
               294            Proline         Serine
[illegible]    [illegible]    Tyrosine        Phenylalanine
               231            Arginine        Glutamic acid
[illegible]    92             Valine          Isoleucine
               237            Alanine         Phenylalanine
               67             Glutamine       Arginine
ebh24-16c11    92             Valine          Isoleucine
               93             Glycine         Alanine
               237            Alanine         Phenylalanine
               365            Threonine       Isoleucine
ebh24-16-16    92             Valine          Isoleucine
               106            Alanine         Valine
               237            Alanine         Phenylalanine
ebh24-16-74    88             Histidine       Arginine
               92             Valine          Isoleucine
               237            Alanine         Phenylalanine
ebh-M18        31             Lysine          Glutamic acid
               176            Valine          Methionine
ebh24-16g8     92             Valine          Isoleucine
               237            Alanine         Phenylalanine
               67             Glutamine       Arginine
               130            Threonine       Isoleucine
               176            Alanine         Methionine
ebh24-16b9     92             Valine          Isoleucine
               237            Alanine         Phenylalanine
               67             Glutamine       Arginine
               140            Threonine       Alanine
               176            Serine          Methionine

Example 18: Comparison of epothilone B transformation in cells expressing ebh
and mutants thereof

In these experiments, twenty ml of YEME medium in a 125 ml bi-indented
flask was inoculated with 200 µl of a frozen spore preparation of S. lividans TK24, S.
lividans (pANT849-ebh), S. lividans (pANT849-ebh10-53) or S. lividans (pANT849-
ebh24-16) and incubated 48 hours at 230 rpm, 30°C. Thiostrepton, 10 µg/ml, was
added to media inoculated with S. lividans (pANT849-ebh), S. lividans (pANT849-
ebh10-53) and S. lividans (pANT849-ebh24-16). Four ml of culture was transferred to
20 ml of R5 medium in a 125 ml Erlenmeyer flask and incubated 18 hrs at 230 rpm,
30°C. Epothilone B in 100% EtOH was added to each culture to a final concentration
of 0.05% weight/volume. Samples were taken at 0, 24, 48 and 72 hours with the
exception of the S. lividans (pANT849-ebh24-16) culture, in which the epothilone B
had been completely converted to epothilone F at 48 hours. The samples were
analyzed by HPLC. Results were calculated as a percentage of the epothilone B at
time 0 hours.
Epothilone B:

Time (hours)   TK24    pANT849-ebh   pANT849-ebh10-53   pANT849-ebh24-16
0              100%    100%          100%               100%
24             99%     78%           69%                56%
48             87%     19%           39%                0%
72             87%     0%            3%                 -


Epothilone F:

Time (hours)   TK24    pANT849-ebh   pANT849-ebh10-53   pANT849-ebh24-16
0              0%      0%            0%                 0%
24             0%      4%            9%                 23%
48             0%      21%           29%                52%
72             0%      14%           41%                -
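The normalization used for the two tables above (each value expressed as a percentage of the epothilone B present at time 0) can be sketched as follows. The peak areas in the usage lines are invented placeholders, not data from this document, and the function name `percent_of_initial` is hypothetical. Applying the same formula to epothilone F additionally assumes that both compounds give a comparable detector response, which the source does not state.

```python
def percent_of_initial(peak_area, initial_b_area):
    """Express an HPLC peak area as a percentage of the time-0 epothilone B area."""
    if initial_b_area <= 0:
        raise ValueError("time-0 epothilone B area must be positive")
    return 100.0 * peak_area / initial_b_area


# Hypothetical peak areas (arbitrary units) for one culture.
initial_b = 2000.0   # epothilone B at 0 hours
b_at_24h = 1560.0    # epothilone B remaining at 24 hours
f_at_24h = 80.0      # epothilone F formed at 24 hours

remaining_b_pct = percent_of_initial(b_at_24h, initial_b)
formed_f_pct = percent_of_initial(f_at_24h, initial_b)

print(f"epothilone B remaining: {remaining_b_pct:.0f}%")
print(f"epothilone F formed:    {formed_f_pct:.0f}%")
```

Because both columns share the same denominator, the remaining B and formed F percentages for a given culture and time point can be added to see how much of the starting material is accounted for.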



Alternatively, the bioconversion of epothilone B to epothilone F was
performed in S. rimosus host cells transformed with expression plasmids containing
the ebh gene and its variants or mutants. One hundred µl of a frozen S. rimosus
transformant culture was inoculated to 20 ml CRM media with 10 µg/ml thiostrepton
and cultivated 16-24 hr, 30°C, 230-300 rpm. Epothilone B in 100% ethanol was
added to each culture to a final concentration of 0.05% weight/volume. The reaction
was typically incubated 20-40 hrs at 30°C, 230-300 rpm. The concentration of
epothilones B and F was determined by HPLC analysis.

Evaluation of mutants in S. rimosus

Mutant         Epothilone F yield
[illegible]    55%
[illegible]    75%
ebh24-16c11    75%
ebh24-16-16    75%
ebh24-16-74    75%
ebh24-16b9     80%
ebh24-16g8     85%



Example 19: Biotransformation of compactin to pravastatin

Twenty ml of R2YE media with 10 µg/ml thiostrepton in a 125 ml flask was
inoculated with 200 µl of a frozen spore preparation of S. lividans (pANT849) or S.
lividans (pANT849-ebh) and incubated 72 hours at 230 rpm, 28°C. Four ml of culture
was inoculated to 20 ml of R2YE media and grown 24 hours at 230 rpm, 28°C. One
ml of culture was transferred to a 15 ml polypropylene culture tube, 10 µl of
compactin (40 mg/ml) was added to each culture and incubated for 24 hours, 28°C,
250 rpm. Five hundred µl of the culture broth was transferred to a fresh 15 ml
polypropylene culture tube. Five hundred µl of 50 mM sodium hydroxide was added
and vortexed. Three ml of methanol was added and vortexed, and the tube was
centrifuged 10 minutes at 3000 rpm in a TJ-6 table-top centrifuge. The organic phase
was analyzed by HPLC. Compactin and pravastatin values were assessed relative to
the control S. lividans (pANT849) culture.



Compactin and Pravastatin as a Percentage of Starting Compactin
Concentration:

               S. lividans (pANT849)    S. lividans (pANT849-ebh)
Compactin      36%                      11%
Pravastatin    11%                      53%



Example 20: High performance liquid chromatography method for compactin and
pravastatin detection

The liquid chromatography separation was performed using a Hewlett
Packard 1090 Series Separation system (Agilent Technologies, Palo Alto, California,
USA) and a column, 50 x 4.6 mm, filled with Spherisorb ODS2, particle size 5 µm
(Keystone Scientific, Inc., Bellefonte, Pennsylvania, USA). Gradient mobile
phase programming was used with a flow rate of 2.0 ml/minute. Eluent A was water,
10 mM ammonium acetate and 0.05% phosphoric acid. Eluent B was acetonitrile.
The mobile phase was a linear gradient from 20% B to 90% B over 4 minutes.

Example 21: Structure determination of the biotransformation product of
mutant ebh25-1

Analytical HPLC was performed using a Hewlett Packard 1100 Series Liquid
Chromatograph with a YMC Packed ODS-AQ column, 4.6 mm i.d. x 15 cm length. A
gradient system of water (solvent A) and acetonitrile (solvent B) was used: 20% to
90% B linear gradient, 10 minutes; 90% to 20% B linear gradient, 2 minutes. The flow
rate was 1 ml/minute and UV detection was at 254 nm.

Preparative HPLC was performed using the following equipment and
conditions:

Pump: Varian ProStar Solvent Delivery Module (Varian Inc., Palo Alto, California,
USA).
Detector: Gynkotek UVD340S.
Column: YMC ODS-A column (30 mm ID x 100 mm length, 5 µm particle size).
Elution flow rate: 30 ml/minute.
Elution gradient: (solvent A: water; solvent B: acetonitrile), 20% B, 2 minutes; 20%
to 60% B linear gradient, 18 minutes; 60% B, 2 minutes; 60% to 90% B linear
gradient, 1 minute; 90% B, 3 minutes; 90% to 20% B linear gradient, 2 minutes.
Detection: UV, 210 nm.

LC/NMR was performed as follows: 40 µl of sample was injected onto a
YMC Packed ODS-AQ column (4.6 mm i.d. x 15 cm length). The column was eluted at 1
ml/minute flow rate with a gradient system of D2O (solvent A) and acetonitrile-d3
(solvent B): 30% B, 1 minute; 30% to 80% B linear gradient, 11 minutes. The eluent
passed a UV detection cell (monitored at 254 nm) before flowing through a F19/H1
NMR probe (60 µl active volume) in a Varian AS-600 NMR spectrometer. The
biotransformation product was eluted at around 7.5 minutes and the flow was stopped
manually to allow the eluent to remain in the NMR probe for NMR data acquisition.

Isolation and analysis were performed as follows. The butanol/methanol extract
(about 10 ml) was evaporated to dryness under a nitrogen stream. One ml of methanol
was added to the residue (38 mg) and insoluble material was removed by
centrifugation (13000 rpm, 2 min). 0.1 ml of the supernatant was used for the LC/NMR
study and the remaining 0.9 ml was subjected to preparative HPLC (0.2-0.4 ml per
injection). Two major peaks were observed and collected: peak A was eluted between
14 and 15 minutes, while peak B was eluted between 16.5 and 17.5 minutes.
Analytical HPLC analysis indicated that peak B was the parent compound, epothilone
B (Rt 8.5 minutes), and peak A was the biotransformation product (Rt 7.3 minutes).
The peak A fractions were pooled and MS analysis data were obtained with the pooled
fractions. The pooled fraction was evaporated to a small volume, then was
lyophilized to give 3 mg of white solid. NMR and HPLC analysis of the white solid
(dissolved in methanol) revealed that the biotransformation product was partially
decomposed during the drying process.
APPENDIX 1

Atom No.  Residue  Atom Name  X-coord   Y-coord   Z-coord
1         ALA9     N          -39.918   -4.913    -1.651
2         ALA9     CA         -38.454   -5.033    -1.537
3         ALA9     C          -37.953   -4.886    -0.099
4         ALA9     O          -38.625   -4.31     0.765
5         ALA9     CB         -37.809   -3.967    -2.415
6         THR10    N          -36.781   -5.447    0.146
7         THR10    CA         -36.187   -5.437    1.49
8         THR10    C          -34.916   -4.585    1.553
9         THR10    O          -34.016   -4.735    0.72
10        THR10    CB         -35.871   -6.887    1.846
11        THR10    OG1        -37.075   -7.631    1.717
12        THR10    CG2        -35.355   -7.053    3.271

1*^ 


1 PI H 1 


N 


-34 858 


-3 699 


2 536 


If 


1 PU1 1 


HA 


.'^ fi8Q 


-p 853 


2.745 


lO 


1 PI 11 1 
LCU1 1 


p 


-3P 'ill 




3 353 


ID 


1 PI 11 1 


n 




-4 4fiR 


4 P5Q 

*T.C^WW 


1 / 


1 PI 11 1 




n35i 


-1 707 

— 1 . / w # 


3 Rft7 

w.wO / 




1 PI 11 1 




-wO.w/ v7 


-O 7ft 


3 07ft 

w.w/ O 




1 PI 11 1 


pni 


-Ow.Ow 


0 PRR 


4 0Q1 

t.ww 1 


on 


1 PI 111 




-0*T.wwO 


-0 111 
-w. Ill 


1 81 

1 .w 1 


oi 
^1 




IN 


OI op 

~0 1 .w^ 


-3 4PP 


P ftP3 




nnU lii 


PA 


on 191 
-OU. 1^ 1 


.4 1 1Q 


3 30P 






V/ 




-3 ROR 


4 663 


OA 




V-/ 


^C7.00w 


-P 3Q7 


4.918 




PRm 0 


PR 


-9Q nfti 




P P5Q 






pr5 


-9Q 'WT 

fcw ■ v/w / 


-P 771 


1.309 


OT 




pn 


-31 n3i 

-w i • VO 1 


-P 4Q3 


1 729 


C.O 


1 PI 11^ 
LEILI to 


M 


-PQ P7ft 

£w.£f O 


-4 5PP 


5 54 

W.W'T 


OCk 


1 PI H'^ 


PA 


-PR R7R 


-4.118 


6.819 




1 Plll*^ 


p 


-P7 1fi3 

Cml m lOW 


-3.88 


6.627 


^1 


1 PU13 


o 

v/ 


.PR AAQ 

fcW.'I'l w 


-4.806 


6.267 




1 PU13 


PR 


-28.898 


-5.196 


7,872 


33        LEU13    CG         -30.374   -5.354    8.217
34        LEU13    CD1        -30.587   -6.516    9.181
35        LEU13    CD2        -30.945   -4.067    8.802
36        ALA14    N          -26.72    -2.741    7.112
37        ALA14    CA         -25.355   -2.266    6.825
38        ALA14    C          -24.244   -2.941    7.634
39        ALA14    O          -23.058   -2.719    7.372
40        ALA14    CB         -25.311   -0.764    7.075
41        ARG15    N          -24.628   -3.792    8.569
42        ARG15    CA         -23.664   -4.537    9.379
43        ARG15    C          -23.478   -5.983    8.91
44        ARG15    O          -22.815   -6.767    9.599
45        ARG15    CB         -24.174   -4.519    10.81
46        ARG15    CG         -25.655   -4.879    10.84
47        ARG15    CD         -26.2     -4.843    12.26
48        ARG15    NE         -27.657   -5.039    12.256
49        ARG15    CZ         -28.358   -5.301    13.36
50        ARG15    NH1        -29.69    -5.376    13.3
51        ARG15    NH2        -27.735   -5.412    14.536
52        LYS16    N          -24.096   -6.351    7.798


WO 2004/061116                                        PCT/US2003/034082



53        LYS16    CA         -24.016   -7.741    7.335
54        LYS16    C          -22.639   -8.128    6.807
55        LYS16    O          -21.959   -7.359    6.115
56        LYS16    CB         -25.061   -7.977    6.252
57        LYS16    CG         -26.466   -7.985    6.839
58        LYS16    CD         -26.605   -9.079    7.892
59        LYS16    CE         -28.002   -9.092    8.499
60        LYS16    NZ         -28.113   -10.128   9.537
61        CYS17    N          -22.317   -9.392    7.036
62        CYS17    CA         -21.061   -10.004   6.56
63        CYS17    C          -20.737   -9.771    5.066
64        CYS17    O          -19.662   -9.205    4.833
65        CYS17    CB         -21.096   -11.501   6.864
66        CYS17    SG         -21.33    -11.937   8.602
67        PRO18    N          -21.635   -10.003   4.1
68        PRO18    CA         -21.293   -9.756    2.683
69        PRO18    C          -21.123   -8.291    2.246
70        PRO18    O          -21.013   -8.061    1.036
71        PRO18    CB         -22.388   -10.383   1.878
72        PRO18    CG         -23.509   -10.812   2.802
73        PRO18    CD         -23.002   -10.554   4.207
74        PHE19    N          -21.137   -7.33     3.162
75        PHE19    CA         -20.792   -5.947    2.834
76        PHE19    C          -19.279   -5.777    2.788
77        PHE19    O          -18.789   -4.92     2.036
78        PHE19    CB         -21.36    -5.007    3.894
79        PHE19    CG         -22.8     -4.568    3.654
80        PHE19    CD1        -23.051   -3.27     3.232
81        PHE19    CD2        -23.856   -5.444    3.867
82        PHE19    CE1        -24.355   -2.853    3.003
83        PHE19    CE2        -25.159   -5.03     3.629
84        PHE19    CZ         -25.409   -3.735    3.197
85        SER20    N          -18.573   -6.687    3.449
86        SER20    CA         -17.102   -6.717    3.446
87        SER20    C          -16.569   -7.839    4.342
88        SER20    O          -16.632   -7.723    5.573
89        SER20    CB         -16.557   -5.371    3.929
90        SER20    OG         -17.236   -5.019    5.129
91        PRO21    N          -15.974   -8.867    3.753
92        PRO21    CA         -15.978   -9.134    2.304
93        PRO21    C          -17.267   -9.836    1.856
94        PRO21    O          -18.026   -10.327   2.702
95        PRO21    CB         -14.8     -10.047   2.111
96        PRO21    CG         -14.442   -10.669   3.455
97        PRO21    CD         -15.306   -9.949    4.481
98        PRO22    N          -17.551   -9.859    0.561
99        PRO22    CA         -16.897   -9.007    -0.445
100       PRO22    C          -17.4     -7.575    -0.296
101       PRO22    O          -18.341   -7.371    0.469
102       PRO22    CB         -17.32    -9.591    -1.762
103       PRO22    CG         -18.478   -10.549   -1.528
104       PRO22    CD         -18.669   -10.604   -0.021
105       PRO23    N          -16.687   -6.605    -0.842
106       PRO23    CA         -17.224   -5.241    -0.897
107       PRO23    C          -18.525   -5.21     -1.693
108       PRO23    O          -18.524   -5.083    -2.925
109       PRO23    CB         -16.159   -4.417    -1.547
110       PRO23    CG         -15.004   -5.321    -1.95
111       PRO23    CD         -15.388   -6.725    -1.509
112       GLU24    N          -19.62    -6.122    -0.956
113       GLU24    CA         -20.963   -5.192    -1.547
114       GLU24    C          -21.415   -3.843    -2.088
115       GLU24    O          -22.323   -3.794    -2.93
116       GLU24    CB         -21.934   -5.68     -0.48
117       GLU24    CG         -23.27    -6.137    -1.052
118       GLU24    CD         -23.982   -7.017    -0.024
119       GLU24    OE1        -24.613   -7.981    -0.433
120       GLU24    OE2        -23.833   -6.745    1.158
121       TYR25    N          -20.573   -2.843    -1.878
122       TYR25    CA         -20.842   -1.47     -2.303
123       TYR25    C          -20.704   -1.311    -3.816
124       TYR25    O          -21.364   -0.436    -4.385
125       TYR25    CB         -19.828   -0.568    -1.608
126       TYR25    CG         -19.616   -0.882    -0.128
127       TYR25    CD1        -20.662   -0.753    0.779
128       TYR25    CD2        -18.364   -1.298    0.311
129       TYR25    CE1        -20.461   -1.062    2.119
130       TYR25    CE2        -18.163   -1.605    1.65
131       TYR25    CZ         -19.213   -1.492    2.55
132       TYR25    OH         -19.026   -1.859    3.866
133       GLU26    N          -20.1     -2.296    -4.468
134       GLU26    CA         -20.009   -2.293    -5.928
135       GLU26    C          -21.404   -2.483    -6.52
136       GLU26    O          -21.92    -1.572    -7.177
137       GLU26    CB         -19.129   -3.454    -6.39
138       GLU26    CG         -17.813   -3.593    -5.628
139       GLU26    CD         -16.94    -2.342    -5.707
140       GLU26    OE1        -16.345   -2.12     -6.749
141       GLU26    OE2        -16.773   -1.731    -4.657
142       ARG27    N          -22.105   -3.488    -6.017
143       ARG27    CA         -23.437   -3.805    -6.538
144       ARG27    C          -24.504   -2.909    -5.921
145       ARG27    O          -25.496   -2.591    -6.59
146       ARG27    CB         -23.752   -5.26     -6.22
147       ARG27    CG         -22.7     -6.189    -6.812
148       ARG27    CD         -23.031   -7.653    -6.55
149       ARG27    NE         -23.146   -7.926    -5.108
150       ARG27    CZ         -22.251   -8.648    -4.428
151       ARG27    NH1        -21.16    -9.11     -5.043
152       ARG27    NH2        -22.428   -8.879    -3.126
153       LEU28    N          -24.197   -2.331    -4.771
154       LEU28    CA         -25.11    -1.358    -4.168
155       LEU28    C          -25.131   -0.079    -4.987
156       LEU28    O          -26.214   0.286     -5.45
157       LEU28    CB         -24.67    -1.039    -2.746
158       LEU28    CG
          ILE377   CG1        9.836     -8.352    5.203
2865      ILE377   CG2        7.529     -9.169    5.674
2866      ILE377   CD1        9.433     -6.919    5.526
2867      ARG378   N          7.743     -12.55    4.009
2868      ARG378   CA         6.648     -13.342   3.451
2869      ARG378   C          5.468     -13.417   4.409
2870      ARG378   O          5.629     -13.304   5.627
2871      ARG378   CB         7.186     -14.734   3.163


2872 


ARG378 


CG 


8.265 


-14.671 


9 DRQ 


2873 


ARG378 


CD 


8.975 


-16.01 


1 Q9Q 


2874 


ARG378 


NE 


9 7Sfi 




'\ 1*^4 


2875 


ARG378 


CZ 


Q "57 


-17 AR1 


Rft4 
O.ODH 


2876 


ARG378 


NH1 


R *iR7 


-1R 9R 




2877 


ARG37a 


Ivl tc 




-17 R'^Q 


4 Q'^l 


2878 


ILE379 


N 

MM 


d 077 


_iO CQ-I 
" lO.OO 1 


0.004 


2879 


ILE379 


CA 


0.V90 


- IO.# lO 


4 7nQ 


2880 


ILE379 


c 


14. 


- 1 0. lO 




2881 


IL E379 


o 


O.O 1 s7 


-1ft 07 


4 Rft^ 


2882 


ILE37Q 




1 •O'r 1 


_10 CQA 


O.oOO 


2883 




CGI 




.19 RRQ 


9 ftQ9 


2884 


ILE379 






-19 Qfl4. 


4 7nQ 


2885 


ILE379 


GDI 




-1 9 447 


1 7QR 


2886 




IN 


O R70 


-1 9ft7 


O.OD 


2887 


ALA380 


CA 




-1R '^QR 


7 174 


2888 




n 


1 .ooo 


-17 977 


7 n7 


2889 


ALAsao 


n 




-1R R1 


7 007 


2890 


ALA380 

#»L— /»Vi* W 


CB 




.1 ft 471 


ft ft'^c^ 


2891 


VAL381 


N 


n AQR 


-1ft 47 


ft 017 


2892 


VAL381 


CA 




-17 noR 


ft ft>\1 


2893 


VAL381 


c 


-1 9*^1 


-1ft 74R 


C 9 


2894 


VAL381 


o 


W. / Vile. 


-1 7ftft 


4 ^O 


2895 


VAL381 


CB 


-1 RAT 


-1ft '^'^Q 


7 KQO 


2896 


VAL381 


CGI 


-1 70^ 


-Ifi RQ7 

1 O.Ov7/ 


Q 01R 
y.u i 0 


2897 


VAL381 


CG2 


-1 .747 


-14 ft.'^Q 


7 R7 


2898 


PR0382 


N 


-1 .999 


-17 669 


4 ft'^S 


2899 


PR0382 


CA 


-2.615 


A1A2A 


3.329 


2900      PRO382   C          -3.477    -16.166   3.352
2901      PRO382   O          -4.045    -15.802   4.391
2902      PRO382   CB         -3.422    -18.651   3.039
2903      PRO382   CG         -3.29     -19.627   4.198
2904      PRO382   CD         -2.414    -18.938   5.231
2905      VAL383   N          -3.721    -15.621   2.172
2906      VAL383   CA         -4.415    -14.327   2.051
2907      VAL383   C          -5.892    -14.388   2.452
2908      VAL383   O          -6.376    -13.473   3.126
2909      VAL383   CB         -4.302    -13.886   0.593
2910      VAL383   CG1        -5.05     -12.578   0.343
2911      VAL383   CG2        -2.838    -13.751   0.177
2912      ASP384   N          -6.478    -15.572   2.355
2913      ASP384   CA         -7.876    -15.767   2.759
2914      ASP384   C          -8.031    -15.962   4.271
2915 ASP384 O

2916 ASP384 CB 

2917 ASP384 CG 

2918 ASP384 OD1 

2919 ASP384 OD2

2920 GLU385 N 

2921 GLU385 CA 

2922 GLU385 C 

2923 GLU385 O 

2924 GLU385 CB 

2925 GLU385 CG 

2926 GLU385 CD 

2927 GLU385 OE1

2928 GLU385 OE2 

2929 LEU386 N

2930 LEU386 CA 

2931 LEU386 C 

2932 LEU386 O 

2933 LEU386 CB 

2934 LEU386 CG 

2935 LEU386 CD1 

2936 LEU386 CD2 

2937 PRO387 N

2938 PRO387 CA

2939 PRO387 C

2940 PRO387 O

2941 PRO387 CB

2942 PRO387 CG

2943 PRO387 CD

2944 PHE388 N 

2945 PHE388 CA 

2946 PHE388 C 

2947 PHE388 O 

2948 PHE388 CB 

2949 PHE388 CG 

2950 PHE388 CD1 

2951 PHE388 CD2 

2952 PHE388 CE1 

2953 PHE388 CE2 

2954 PHE388 CZ 

2955 LYS389 N 

2956 LYS389 CA 

2957 LYS389 C 

2958 LYS389 O 

2959 LYS389 CB 

2960 LYS389 CG 

2961 LYS389 CD 

2962 LYS389 CE 

2963 LYS389 NZ 

2964 HIS390 N 

2965 HIS390 CA 

2966 HIS390 C 

2967 H)S390 O 

2968 HIS390 CB 



-Q 156 


-1 ft OQA 


4.7ol 




-1 7 nn^ 


^.046 


-ft PQ*^ 


-1ft ft/lQ 


0.534 




-1ft lOft 
-ID. l/£o 


A AO/% 

-0.032 




-17 <I4R 


A AAA 

-0.002 




- lo.yyo 


5 




- ID.I / / 


6.448 






7.194 


"O.OOO 


1 yi one 


8.429 




-1 7.259 


6.874 


-ft Oi Q 


- lO.OD 1 


6.111 


-f .ODl 


-19.079 


6.248 


-0.017 


-19.462 


7.349 


o ooc 


-19.256 


5.205 


-D.4UO 


-13.81 


6.463 


"O.0o2 


-12.519 


7.093 


-7.266 


-11.953 


7.874 


-o.o42 


-11.71 


7.315 


-5.676 


-11.542 


5.996 


-4.348 


-11.943 


5.365 


-4.081 


-11.153 


4.091 


-3.204 


-11.773 


6.357 


-7.063 


-11.798 


9.173 


-8.132 


-11.39 


10,091 


-8.419 


-9.89 


10.047 


-7.84 


-9.095 


10.805 


-7.647 


-11.801 


11.445 


-6.191 


-12.224 


11.339 


-5.817 


-12.105 


9.873 


-9.31 4 


-9.528 


9.143 


-9.775 


-8.145 


9.012 


-10.688 


-7.79 


10.176 


-1 1 .522 


-8.597 


10.603 


-10.558 


-7.999 


7.709 


-9.7oo 


-8.343 


6.437 


O "70 

-o./o 


-7.498 


5.987 


-10.097 


-9.492 


5.721 


-o.u7o 


-7.809 


4.831 




-9.804 


4.565 


-O.OOl 


-o.9d3 


4.121 


-lU.O 


-0.599 


10.707 


-1 1 .OOh 


-D. l41 


11.792 


1 0 ftOft 


C CO 

-o.o2 


1 1 .203 




-4.00 1 


At\ A 

10.4 


in fti 1 


c lie 
-5.1 15 


4 A OA A 

12.633 


-11 A^Q 

- 1 1 .*foy 


-4./O0 


40 Q AT 

13.847 


-in ft77 


-o./do 


1 4.767 


-11.487 


-3.466 


16.023 


-10.719 


-2.637 


16.96 


-13.775 


-6.068 


11.571 


-15.055 


-5.523 


11.102 


-15.385 


-4.226 


11.836 


-15.845 


-4.213 


12.983 


-16.162 


-6.548 


11.316 
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2969 


HIS390 


CG 


-17.525 


-6.094 


10 826 


2970 


HIS390 


ND1 


-17.895 


-5.893 


9 54S 

».0*tO 


2971 


HIS390 


CD2 


-18.62 


-5.81 


11 607 

1 i .w/ 


2972 


HIS390 


CE1 


-19.181 


-5.487 


9 S11 


2973 


HIS390 


NE2 


-19.629 


-5.437 




2974 


ASP391 


N 


-15.053 


-3.138 


11 1fi7 


2975 


ASP391 


CA 


-15.269 


-1 .789 


11 ftft*^ 
1 1 .000 


2976 


ASP391 


c 


-1 5.392 


-0 871 


10 4ft 


2977 


ASP391 


0 


-14.395 


-0 539 


Q ft^^(^ 


2978 


ASP391 


CB 


-14 068 


-1 414 


19 


2979 


ASP391 


CG 


-14 172 




1*5 1ft 
10. lo 


2980 


ASP391 


0D1 


-14. QfKd 


n 771 


1 9 7n7 


2981 


ASP391 


OD2 


-1 941 
1 0.^'r 1 


n ^594 


1Q QftA 


2982 


SER392 


N 


-1R c*ft9 


-U.oOO 


in OQQ 


2983 


SER392 


CA 


1 o.ooo 




y.U/i> 


2984 


SER392 


c 


-1ft 9ft1 


1 ftftQ 


y.i^y 


2985 


SER392 




- 1 O.S79 




O-07 


2986 


SER392 






A KQ7 
V»OOf 


O.OOO 


2987 


SER392 




-lO.O/D 


l.OOO 


A 0~70 

9.o7o 


2988 


THR393 


In 




O QOO 
iC.09^ 


1 \j.d!Q7 


2989 


THR393 


CA 


-1 R A<^1 


O./DO 


1 u.oi 0 


2990 


THR393 


c 


- 1 0,%jO i 


Q ftOO 


iA yiaO 

1 U.4DO 


2991 


THR393 






yl 700 


y.yoy 


2992 


THR3Q3 


CR 


- ID. loo 


4.00 


11.411 


2993 




wVJ 1 


-J / .DO/ 


4.00I 


11.04 


2994 


THR393 


CG? 


-1 *; 7n*^ 


C Q70 

o.y/^ 


4 H COD 


2995 


ILE394 




- lO.ODD 


ii.O lO 


■A A -f -4 0 
11.1 I0 


2996 


ILE394 


CA 


i -f ftOQ 

-1 I .oyy 


id.DOO 


1 1.2O0 


2997 


ILE394 




-11 <;9^ 


1 OAA 


^A AO 


2998 


ILE394 


o 


-11 9n*\ 


U.41 y 


1 1 .0^4 


2999 


ILE394 


CB 


-1 1 "^7 


Q not; 
o.uyo 


•i 0 C77 


3000 


ILE394 


CGI 


-1 1 744 
-II./ *f *f 


4. OOO 


HO AO^ 

lii.y/i I 


3001 


ILE394 


CGP 


-Q ft47 


O Q7fi 

^.y/o 


•10 CO>l 

1 id.D24 


3002 


ILE394 


CD1 


-in Q77 


1^ CC90 

D.D^y 


4 0 ACC 


3003 


TYR395 

III iv^ww 


N 


-11 TO 


U.OD4 


y.o4y 


3004 


TYR395 


CA 


-11 90 

1 1 .^C7 


-u.ouo 


0 -I7Q 

y.i / y 


3005 


TYR395 


c 


-Q 709 


-U-fOO 


0 OAO 


3006 


TYR395 


o 


-A QQ7 


n 14Q 
u. 1 Hy 


Q 447 


3007 


TYR395 


CB 


-11 747 


.A C70 


7 791 


3008 


TYR395 


CG 


-11 7ft4 


-1 Qft7 


7 ini 


3009 


TYR395 


GDI 


-10.958 


-9 979 


ft 09ft 


3010 


TYR395 


CD2 


-12.648 


-9 Q97 


7 R19 

f .O 1^ 


3011 


TYR395 


CE1 


-10.991 


-3.543 




3012 


TYR395 


CE2 


-12.682 


-4 199 


7 0^9 


3013 


TYR395 


CZ 


-11.852 


-4 509 

*r.O\/b 


^ Qft9 

9.90^ 


3014 


TYR395 


OH 


-11.882 


-5.763 


*> 497 

0.*T^/ 


3015 GLY396 N -9.433 -2.053 9.421
3016 GLY396 CA -8.007 -2.401 9.468
3017 GLY396 C -7.732 -3.871 9.759
3018 GLY396 O -8.601 -4.74 9.609
3019 LEU397 N -6.493 -4.132 10.134
3020 LEU397 CA -6.031 -5.497 10.406
3021 LEU397 C -5.334 -5.577 11.751
3022 LEU397 O -4.297 -4.938 11.961
3023 LEU397 CB -5.051 -5.894 9.311
3024 LEU397 CG -5.773 -6.502 8.12
3025 LEU397 CD1 -5.037 -6.225 6.899
3026 LEU397 CD2 -5.979 -7.996 8.396


3027 


HIS398 


N 


-5.87 


-6.402 


1 2 fi'^4 

1 C.OO*T 


3028 


HIS398 


CA 


-5,274 


-6.514 


13 987 


3029 


HIS398 


c 


-4.348 


-7.718 


14 107 

I IV// 


3030 


HIS398 


o 


-3.651 


-7.848 


1*5 19 


3031 


H1S398 


CB 


-6.363 


-6.528 


I*? 033 


3032 


HIS398 


CG 


-6.737 


-5.14 


IS <;9<) 


3033 


HIS398 


ND1 


-7.052 


-4,804 


18 70 


3034 


HIS398 


CD2 


-6.795 


-3 9ft4 


14 7ft1 


3035 


H1S398 


CE1 


-7.31 1 


-3 489 


18 AR1 


3036 


HIS398 


NE2 


-7.152 


.9 97*5 


1 *i 807 


3037 


ALA399 


N 


-4 306 


-ft ^87 


1Q nQ4 


3038 


ALA399 


CA 


-3 343 


-57. 0/ 1 


1Q 1 O 


3039 


ALA3QQ 


w 


-9 7 


•Q 00*^ 


i 1 7KI? 


3040 


ALAS99 


o 




.in 014 

- 1 U.U I*!' 


in 70 A 


3041 






-4 014 


-10 Q'^fi 


lo.ooo 


3042 


LFIJ400 


IN 


t .ooo 


1 U.UU4 


1 1 7QA 


3043 


I PI 1400 


HA 


-O t?R7 




1 U.009 




i PI Mho 


d 
\y 


O T70 


-1 U.ODO 


in QOR 

1 U.Uoo 


3045 


LPU40n 




1 719 


- lU.l DO 


i i OR 
1 1 .OO 


3046 


1 FU400 


V_/tJ 


-0 '^07 


-0,0*fO 


Q QAfi 


3047 




WW 


U.O lO 




<3./00 


3048 


LFIJ400 






-Q A9A 




304Q 


1 FIMOO 








Q 1Q7 

o.iy/ 


3050 






0 A1 R 
U.O 1 o 


-i9 17A 


in QAI 






CA 


o Oft4 


19 Q 


in Q7A 


3052 


PRO401 


c 


C..OO 


-19 A99 


Q ft71 


3053 


PRO401 


o 




-1 949 


o A09 


3054 


PRO401 


CB 




.14 o-io 


11 9AQ 


3055 


PRO401 


CG 


O 1Q7 

v. 157/ 


-1 4 4ftft 


10 00*^ 


3056 


PRO401 


CD 


V/.Ov/O 


-1'^ Oft 


in aofi 


3057 


VAL402 


N 


4 074 

*r-0/ t 


-1 9 987 

1 c..£.0 / 


O 7ft 

y./o 


3058 


VAL402 


CA 


00ft 


-19 18*^ 


ft 


3059 


VAL402 


c 


W.OvyO 


-19 814 


ft QQft 
o.yyo 


3060 


VAL402 


o 


7.008 


-12 dR^ 


Q QQft 


3061 


VAL402 


CB 


5.194 


-1 0 793 


8 98 

0.^\J 


3062 


VAL402 


CGI 


3.968 


-10.185 


7 '593 


3063 


VAL402 


CG2 


5.553 


-9 84 


9 44 


3064 THR403 N 6.772 -13.729 8.146
3065 THR403 CA 8.039 -14.428 8.342
3066 THR403 C 9.135 -13.709 7.571
3067 THR403 O 9.102 -13.66 6.335
3068 THR403 CB 7.888 -15.853 7.827
3069 THR403 OG1 6.715 -16.403 8.406
3070 THR403 CG2 9.077 -16.723 8.22
3071 TRP404 N 10.089 -13.166 8.298
3072 TRP404 CA 11.177 -12.406 7.66
3073 TRP404 C 12.136 -13.344 6.931
3074 TRP404 O 12.984 -12.835 6.21
3075 TRP404 CB 11.969 -11.654 8.719
3076 TRP404 CG 11.163 -10.949 9.79
3077 TRP404 CD1 10.886 -11.444 11.043
3078 TRP404 CD2 10.559 -9.637 9.729
3079 TRP404 NE1 10.155 -10.524 11.721
3080 TRP404 CE2 9.943 -9.428 10.972
3081 TRP404 CE3 10.506 -8.656 8.749
3082 TRP404 CZ2 9.278 -8.237 11.225
3083 TRP404 CZ3 9.838 -7.468 9.009
3084 TRP404 CH2 9.226 -7.257 10.239
3085 TRP404 OXT 12.117 -14.53 7.239


3086 


HEM1 


FE 


-8.08 


12.05 


10 226 


3087 


HEM1 


NA 


-9.653 


12.085 


9 07ft 


3088 


HEM1 


C1A 


-10.7 


13 004 


Q 077 


3089 


HEM1 


C2A 


-1 1 .687 


12 681 


fi 11A 


3090 


HEM1 


C3A 


-11 292 


11 R2R 


7 '^fift 


3091 


HEM1 


C4A 


-10 019 


11 174 


ft 12Q 


3092 


HEM1 


CHB 




in 1 1*% 


7 AQQ 


3093 


HEM1 


C1B 




Q M 

\7.00 


A 1fti 
O.I Of 


3094 


HEM1 


NB 


-7 30fl 




Q 1ft2 


3095 


HEM1 


C4B 


-fi nftfi 


Q Qfi4 


57.00*1' 


3096 


HEM1 


C3B 




fi 


O.OUO 


3097 


HEM1 


C2B 




ft 771 
O./ / 1 


7 74ft 


3098 


HEM1 


CMB 


-7 A1R 

"/.HID 


7 7f\tz 

f m/OO 


fi fiAO 
O.OO^ 


3099 


HEM1 


CAB 




fi n^^i 


A f>Q1 


3100 


HEM1 


CBB 


-*T.*T*r 


7 nm 




3101 


HEM1 


CHC 




in 2Qfi 


in Q74 


3102 


HEM1 


C1C 




11 90<4 


1 I.OOO 


3103 


HEMI 


NC 


-fi '^IQ 




I I.OO*f 


3104 


HEM1 


C4C 


-fi 227 


12 fift7 


1 <i.*f ^D 


3105 


HEM1 


C3C 


-A Q2fi 


1 2 fi'^R 

1 C..QOD 


1 o.w^ 


3106 


HEM1 


C2C 




11 f^^fi 


1 ^.o 1 0 


3107 


HEM1 


CMC 




in 712 


12 R'^2 


3108 


HEM1 


CAC 


-4 4fi2 


1*^ 4"^^ 


14 nR*% 


3109 


HEM1 


CBC 


-3 4*52 


1*^ 2*^1 


14 Q'^fi 


3110 


HEM1 


CHD 


-7 061 


I'? ARR 


12 Q1 


3111 


HEM1 


C1D 


-fi 237 


14 2n'^ 


12 2Q2 


3112 


HEM1 


ND 


-8 777 


1*^ '^72 


11 1A 
1 1 . 1 0 


3113 


HEM1 


C4D 


-9 915 


1 4 .'^ 1 


in Qifi 


3114 


HEM1 


C3D 


-10.045 


15 413 


11 fiOfi 


3115 


HEM1 


C2D 


-9.006 


15334 


12 67*? 


3116 


HEM1 


CMD 


-8.71 


16.241 


Ifi ft44 


3117 HEM1 CAD -11.178 16.421 11.802
3118 HEM1 CBD -10.91 17.624 10.918
3119 HEM1 CGD -12.079 18.574 10.862
3120 HEM1 O1D -13.198 18.167 11.204
3121 HEM1 O2D -11.889 19.736 10.477
3122 HEM1 CHA -10.849 14.026 9.961
3123 HEM1 CMA -12.005 10.703 6.498
3124 HEM1 CAA -12.907 13.51 7.748
3125 HEM1 CBA -14.087 13.112 8.645
3126 HEM1 CGA -15.442 13.596 8.14
3127 HEM1 O1A -15.522 14.131 7.009
3128 HEM1 O2A -16.439 13.4 8.866
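The coordinate table above lists, for each atom, a serial number, a residue label, an atom name, and X, Y, Z values. As a minimal sketch of how such rows might be read and sanity-checked, the Python below parses four rows (GLY396 CA, C, O and LEU397 N, taken from the table) and computes two backbone distances. The names `Atom`, `parse_row` and `distance` are illustrative assumptions, not part of the disclosure, and the one-row-per-atom layout is assumed.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    residue: str
    name: str
    x: float
    y: float
    z: float


def parse_row(row: str) -> Atom:
    """Parse one whitespace-separated row: serial, residue, atom name, X, Y, Z."""
    serial, residue, name, x, y, z = row.split()
    return Atom(int(serial), residue, name, float(x), float(y), float(z))


def distance(a: Atom, b: Atom) -> float:
    """Euclidean distance between two atoms, in the units of the table."""
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))


# Rows copied from the table above (atoms 3016 to 3019).
ROWS = """\
3016 GLY396 CA -8.007 -2.401 9.468
3017 GLY396 C -7.732 -3.871 9.759
3018 GLY396 O -8.601 -4.74 9.609
3019 LEU397 N -6.493 -4.132 10.134
"""

atoms = {(a.residue, a.name): a for a in map(parse_row, ROWS.splitlines())}
c_o = distance(atoms[("GLY396", "C")], atoms[("GLY396", "O")])
c_n = distance(atoms[("GLY396", "C")], atoms[("LEU397", "N")])
print(f"C=O {c_o:.2f}  C-N {c_n:.2f}")
```

Both distances come out close to typical carbonyl and peptide bond lengths, which is one simple way to check that a row has been transcribed with the right sign and decimal point.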
What is Claimed is:

1. An isolated nucleic acid sequence encoding epothilone B hydroxylase
or a mutant or variant thereof.

2. The isolated nucleic acid sequence of claim 1 comprising SEQ ID NO:
1, 30, 32, 34, 36, 37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 72 or 74.

3. The isolated nucleic acid sequence of claim 1 comprising SEQ ID
NO:1.

4. The isolated nucleic acid sequence of claim 1 encoding a mutant with
at least one amino acid substitution in an active site of the epothilone B hydroxylase
enzyme.

5. The isolated nucleic acid sequence of claim 1 encoding a mutant with
at least one amino acid substitution at amino acid GLU31, ARG67, ARG88, ILE92,
ALA93, VAL106, ILE130, ALA140, MET176, PHE190, GLU231, SER294,
PHE237, or ILE365 of SEQ ID NO:2.

6. The isolated nucleic acid sequence of claim 1 encoding a mutant with
at least one amino acid substitution at amino acid LEU39, GLN43, ALA45, MET57,
LEU58, HIS62, PHE63, SER64, SER65, ASP66, ARG67, GLN68, SER69, LEU74,
MET75, VAL76, ALA77, ARG78, GLN79, ILE80, ASP84, LYS85, PRO86, PHE87,
ARG88, PRO89, SER90, LEU91, ILE92, ALA93, MET94, ASP95, HIS99, ARG103,
PHE110, ILE155, PHE169, GLN170, CYS172, SER173, SER174, ARG175,
MET176, LEU177, SER178, ARG179, ARG186, PHE190, LEU193, VAL233,
GLY234, LEU235, ALA236, PHE237, LEU238, LEU239, LEU240, ILE241,
ALA242, GLY243, HIS244, GLU245, THR246, THR247, ALA248, ASN249,
MET250, LEU283, THR287, ILE288, ALA289, GLU290, THR291, ALA292,
THR293, SER294, ARG295, PHE296, ALA297, THR298, GLU312, GLY313,
VAL314, VAL315, GLY316, VAL344, ALA345, PHE346, GLY347, PHE348,
VAL350, HIS351, GLN352, CYS353, LEU354, GLY355, GLN356, LEU358,
ALA359, GLU362, LYS389, ASP391, SER392, THR393, ILE394, or TYR395 of
SEQ ID NO:2.

7. The isolated nucleic acid sequence of claim 1 encoding a variant
comprising SEQ ID NO:43, 44, 45, 46, 47, 48 or 49.

8. A polypeptide encoded by the isolated nucleic acid sequence of claim 1.

9. An isolated nucleic acid molecule that is capable of hybridizing to a
nucleic acid sequence of claim 2, or to the complementary sequence of said nucleic
acid sequence, under hybridization conditions of 3X SSC at 65°C for 16 hours, said
isolated nucleic acid molecule being capable of remaining hybridized to said nucleic
acid sequence, or to the complementary sequence of said nucleic acid sequence, under
wash conditions of 0.5X SSC, 55°C for 30 minutes.

10. An isolated polypeptide comprising SEQ ID NO:2. 

11. An isolated mutant polypeptide of epothilone B hydroxylase of SEQ
ID NO:2 comprising an amino acid sequence with at least one amino acid substitution
in an active site of epothilone B hydroxylase enzyme of SEQ ID NO:2.

12. An isolated mutant polypeptide of epothilone B hydroxylase of SEQ
ID NO:2 comprising an amino acid sequence with at least one amino acid substitution
at amino acid GLU31, ARG67, ARG88, ILE92, ALA93, VAL106, ILE130,
ALA140, MET176, PHE190, GLU231, SER294, PHE237, or ILE365 of SEQ ID
NO:2.

13. An isolated mutant polypeptide of epothilone B hydroxylase of SEQ
ID NO:2 comprising an amino acid sequence with at least one amino acid substitution
at amino acid LEU39, GLN43, ALA45, MET57, LEU58, HIS62, PHE63, SER64,
SER65, ASP66, ARG67, GLN68, SER69, LEU74, MET75, VAL76, ALA77,
ARG78, GLN79, ILE80, ASP84, LYS85, PRO86, PHE87, ARG88, PRO89, SER90,
LEU91, ILE92, ALA93, MET94, ASP95, HIS99, ARG103, PHE110, ILE155,
PHE169, GLN170, CYS172, SER173, SER174, ARG175, MET176, LEU177,
SER178, ARG179, ARG186, PHE190, LEU193, VAL233, GLY234, LEU235,
ALA236, PHE237, LEU238, LEU239, LEU240, ILE241, ALA242, GLY243,
HIS244, GLU245, THR246, THR247, ALA248, ASN249, MET250, LEU283,
THR287, ILE288, ALA289, GLU290, THR291, ALA292, THR293, SER294,
ARG295, PHE296, ALA297, THR298, GLU312, GLY313, VAL314, VAL315,
GLY316, VAL344, ALA345, PHE346, GLY347, PHE348, VAL350, HIS351,
GLN352, CYS353, LEU354, GLY355, GLN356, LEU358, ALA359, GLU362,
LYS389, ASP391, SER392, THR393, ILE394, or TYR395 of SEQ ID NO:2.

14. An isolated mutant polypeptide of epothilone B hydroxylase
comprising SEQ ID NO: 31, 33, 35, 61, 63, 65, 67, 69, 71, 73 or 75.

15. An isolated variant polypeptide of epothilone B hydroxylase
comprising SEQ ID NO: 43, 44, 45, 46, 47, 48 or 49.

16. An isolated nucleic acid sequence encoding a ferredoxin. 

17. The isolated nucleic acid sequence of claim 16 comprising SEQ ID
NO:3.

18. A polypeptide encoded by the isolated nucleic acid sequence of claim 16.

19. An isolated nucleic acid molecule that is capable of hybridizing to the
nucleic acid sequence set forth in SEQ ID NO:3, or to the complementary sequence of
the nucleic acid sequence set forth in SEQ ID NO:3, under hybridization conditions of
3X SSC at 65°C for 16 hours, said isolated nucleic acid molecule being capable of
remaining hybridized to the nucleic acid sequence set forth in SEQ ID NO:3, or to the
complementary sequence of the nucleic acid sequence set forth in SEQ ID NO:3,
under wash conditions of 0.5X SSC, 55°C for 30 minutes.

20. A vector comprising the isolated nucleic acid sequence of claim 1. 

21. The vector of claim 20 further comprising an isolated nucleic acid
sequence encoding a ferredoxin.

22. A host cell comprising the vector of claim 20.

23. A host cell comprising the vector of claim 21.

24. A method for producing recombinant microorganisms which
hydroxylate epothilones having a terminal alkyl group to produce epothilones having
a terminal hydroxyalkyl group, said method comprising transfecting a microorganism
with the vector of claim 20 or 21.

25. A recombinantly produced microorganism that hydroxylates
epothilones having a terminal alkyl group to produce epothilones having a terminal
hydroxyalkyl group.

26. The recombinantly produced microorganism of claim 25 wherein said
microorganism expresses a nucleic acid sequence of SEQ ID NO: 1, 30, 32, 34, 36,
37, 38, 39, 40, 41, 42, 60, 62, 64, 66, 68, 72 or 74.

27. A method for the preparation of at least one epothilone of the
following formula I

HO-CH2-(A1)n-(Q)m-(A2)o-E (I)

where

A1 and A2 are independently selected from the group of optionally substituted C1-C3
alkyl and alkenyl;

Q is an optionally substituted ring system containing one to three rings and at least
one carbon to carbon double bond in at least one ring;

n, m, and o are integers selected from the group consisting of zero and 1, where at
least one of m or n or o is 1; and
E is an epothilone core;
comprising the steps of contacting at least one epothilone of the following formula II

CH3-(A1)n-(Q)m-(A2)o-E (II)
where A1, Q, A2, E, n, m, and o are defined as above;
with a recombinantly produced microorganism, or an enzyme derived therefrom,
which is capable of selectively catalyzing the hydroxylation of Formula II, and
effecting said hydroxylation.

28. A method for the preparation of an epothilone analog of Formula A

[chemical structure of Formula A]

said method comprising biotransforming epothilone B to the epothilone analog of
Formula A by incubation with a mutant epothilone B hydroxylase enzyme comprising
SEQ ID NO:31.

29. A compound of Formula A 




or a pharmaceutically acceptable salt thereof.

30. A homology model of epothilone B hydroxylase having a root mean
square deviation of conserved residue backbone atoms of less than about 4.0 Å when
superimposed on the corresponding backbone atoms described by structure coordinates
listed in Appendix 1.
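The root mean square deviation recited in claim 30 can be illustrated with a short calculation. The sketch below assumes the two structures have already been superimposed by some fitting procedure (not shown), and that atoms are paired one to one; it then takes the square root of the mean squared distance between paired backbone atoms. The reference points are the GLY396 N, CA and C rows of Appendix 1; the "model" coordinates and the names `rmsd`, `REFERENCE`, `MODEL` and `CUTOFF` are hypothetical and serve only as an example.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(model: Sequence[Point], reference: Sequence[Point]) -> float:
    """RMSD between paired atoms of two already-superimposed structures."""
    if len(model) != len(reference) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(model, reference))
    return math.sqrt(total / len(model))


# Backbone N, CA, C of GLY396 as listed in Appendix 1.
REFERENCE = [
    (-9.433, -2.053, 9.421),
    (-8.007, -2.401, 9.468),
    (-7.732, -3.871, 9.759),
]

# Hypothetical model: every atom displaced by (0.3, -0.4, 0.0), i.e. 0.5 per atom.
MODEL = [(x + 0.3, y - 0.4, z) for x, y, z in REFERENCE]

CUTOFF = 4.0  # the "less than about 4.0" threshold recited in the claim

value = rmsd(MODEL, REFERENCE)
print(f"RMSD = {value:.3f}; below cutoff: {value < CUTOFF}")
```

In practice the comparison would be restricted to the conserved residue backbone atoms, and the superposition itself would be performed first, for example by a least-squares fit.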
31. A method for producing a mutant with altered biological properties,
function, yield of a desired product, rate of reaction, substrate specificity, or activity
as compared to epothilone B hydroxylase, said method comprising the steps of:
identifying an amino acid of SEQ ID NO:2 to mutate; and mutating the identified
amino acid to create a mutant protein.

32. The method of claim 31 wherein a homology model of epothilone B
hydroxylase having a root mean square deviation of conserved residue backbone
atoms of less than about 4.0 Å when superimposed on the corresponding backbone
atoms described by structure coordinates listed in Appendix 1 is used to identify an
amino acid of SEQ ID NO:2 to mutate.

33. The method of claim 31 wherein the identified amino acid is LEU39,
GLN43, ALA45, MET57, LEU58, HIS62, PHE63, SER64, SER65, ASP66, ARG67,
GLN68, SER69, LEU74, MET75, VAL76, ALA77, ARG78, GLN79, ILE80, ASP84,
LYS85, PRO86, PHE87, ARG88, PRO89, SER90, LEU91, ILE92, ALA93, MET94,
ASP95, HIS99, ARG103, PHE110, ILE155, PHE169, GLN170, CYS172, SER173,
SER174, ARG175, MET176, LEU177, SER178, ARG179, ARG186, PHE190,
LEU193, VAL233, GLY234, LEU235, ALA236, PHE237, LEU238, LEU239,
LEU240, ILE241, ALA242, GLY243, HIS244, GLU245, THR246, THR247,
ALA248, ASN249, MET250, LEU283, THR287, ILE288, ALA289, GLU290,
THR291, ALA292, THR293, SER294, ARG295, PHE296, ALA297, THR298,
GLU312, GLY313, VAL314, VAL315, GLY316, VAL344, ALA345, PHE346,
GLY347, PHE348, VAL350, HIS351, GLN352, CYS353, LEU354, GLY355,
GLN356, LEU358, ALA359, GLU362, LYS389, ASP391, SER392, THR393,
ILE394, or TYR395 of SEQ ID NO:2.



34. The method of claim 31 wherein the identified amino acid is GLU31,
ARG67, ARG88, ILE92, ALA93, VAL106, ILE130, ALA140, MET176, PHE190,
GLU231, SER294, PHE237, or ILE365 of SEQ ID NO:2.
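The two steps of claim 31 (identify an amino acid of SEQ ID NO:2, then mutate it) can be sketched in code. The example below checks that each residue label recited in claim 34 names the residue actually present at that position of SEQ ID NO:2, and then applies a single substitution. The one-letter sequence `EPOB` is transcribed from SEQ ID NO:2; the helper names (`parse_site`, `site_matches`, `substitute`) and the choice of alanine as the replacement residue are hypothetical illustrations, not a mutant described in the specification.

```python
import re

# SEQ ID NO:2 in one-letter code, grouped 16 residues per block as in the listing.
EPOB = "".join("""
MTDVEETTATLPLARK CPFSPPPEYERLRRES PVSRVGLPSGQTAWAL TRLEDIREMLSSPHFS
SDRQSPSFPLMVARQI RREDKPFRPSLIAMDP PEHGKARRDVVGEFTV KRMKALQPRIQQIVDE
HIDALLAGPKPADLVQ ALSLPVPSLVICELLG VPYSDHEFFQSCSSRM LSREVTAEERMTAFES
LENYLDELVTKKEANA TEDDLLGRQILKQRES GEADHGELVGLAFLLL IAGHETTANMISLGTV
TLLENPDQLAKIKADP GKTLAAIEELLRIFTI AETATSRFATADVEIG GTLIRAGEGVVGLSNA
GNHDPDGFENPDTFDI ERGARHHVAFGFGVHQ CLGQNLARLELQIVFD TLFRRVPGIRIAVPVD
ELPFKHDSTIYGLHAL PVTW
""".split())

THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def parse_site(label: str):
    """Split a label such as 'SER294' into ('S', 294)."""
    match = re.fullmatch(r"([A-Z]{3})(\d+)", label)
    if match is None:
        raise ValueError(f"unrecognized residue label: {label!r}")
    return THREE_TO_ONE[match.group(1)], int(match.group(2))


def site_matches(seq: str, label: str) -> bool:
    """True if the sequence carries the named residue at the named 1-based position."""
    aa, pos = parse_site(label)
    return 1 <= pos <= len(seq) and seq[pos - 1] == aa


def substitute(seq: str, label: str, new_aa: str) -> str:
    """Return a copy of seq with the residue at the labeled site replaced."""
    if not site_matches(seq, label):
        raise ValueError(f"{label} does not match the sequence")
    _, pos = parse_site(label)
    return seq[:pos - 1] + new_aa + seq[pos:]


CLAIM_34_SITES = [
    "GLU31", "ARG67", "ARG88", "ILE92", "ALA93", "VAL106", "ILE130",
    "ALA140", "MET176", "PHE190", "GLU231", "SER294", "PHE237", "ILE365",
]

print(all(site_matches(EPOB, site) for site in CLAIM_34_SITES))
mutant = substitute(EPOB, "SER294", "A")  # hypothetical replacement residue
print(EPOB[290:297], "->", mutant[290:297])
```

A label whose residue type disagrees with the sequence raises an error, which makes this kind of check useful for catching transcription mistakes in long residue lists.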
35. The method of claim 31 wherein the mutant protein improves yield of
a desired product as compared to the yield of a desired product obtained using
epothilone B hydroxylase.

36. The method of claim 35 wherein the desired product is epothilone F.

37. The method of claim 31 wherein the mutant improves the rate of
reaction as compared to the rate of reaction using epothilone B hydroxylase.

38. The method of claim 31 wherein the mutant exhibits altered substrate
specificity as compared to substrate specificity of epothilone B hydroxylase.

39. The method of claim 38 wherein amino acid SER294 is mutated.

40. The method of claim 31 wherein the mutant exhibits essentially
similar biological activity or function to epothilone B hydroxylase.

41. A machine-readable data storage medium comprising a data storage
material encoded with structure coordinates set forth in Appendix 1.
Alignment used to design primers P450-1+ and P450-1a+

STMSUACB   tcctcatcgccggccacgagac (SEQ ID NO: 5)
STMSUBCB   tgctggtcgccggccacgagac (SEQ ID NO: 6)
3702259    tgctcatcaccggccaggacac (SEQ ID NO: 7)
SSU65940   --ctgttcgccgggcacgactc (SEQ ID NO: 8)
STMOLEP    tgctcatcgcgggccacgagac (SEQ ID NO: 9)
SERCP450A  tgctggtcgccgggcacgagac (SEQ ID NO: 10)

Alignment used to design primers P450-2+ and P450-2-

STMSUACB   cggcgcggtggaggaactgct (SEQ ID NO: 11)
STMSUBCB   gggcgccgtcgaggagctgct (SEQ ID NO: 12)
3702259    ccgcaccctggaggagctgct (SEQ ID NO: 13)
SSU65940   cggcgcggtcgaggagatgct (SEQ ID NO: 14)
STMOLEP    cgcggcggtggaggagatgct (SEQ ID NO: 15)
SERCP450A  cggcgcgatcgaggagaccct (SEQ ID NO: 16)

Alignment used to design primer P450-3-

STMSUACB   ttcggcttcggcgtgcaccagtgcctgggc (SEQ ID NO: 17)
STMSUBCB   ttcggcttcggcgtccaccagtgcctggga (SEQ ID NO: 18)
3702259    ttcggctggggcGcccaccactgcctgggc (SEQ ID NO: 19)
SSU65940   ttcggtcacggcgtccacaagtgtcctggc (SEQ ID NO: 20)
STMOLEP    ttcgggcacggagcgcaccactgcatcggc (SEQ ID NO: 21)
SERCP450A  ttcggccacggcatccacttctgcgtgggc (SEQ ID NO: 22)
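Degenerate primers are commonly written with IUPAC ambiguity codes that summarize the bases seen at each aligned position. As a hedged illustration only (this is not stated to be the procedure used to design the primers), the sketch below derives a fully degenerate consensus from the six ungapped 21-nucleotide sequences of the second alignment, SEQ ID NO: 11 to 16. The names `IUPAC`, `consensus` and `P450_2_ALIGNMENT` are assumptions introduced for the example.

```python
# IUPAC nucleotide ambiguity codes keyed by the set of bases observed in a column.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def consensus(seqs):
    """Column-wise IUPAC consensus of equal-length, ungapped DNA sequences."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must all have the same length")
    columns = zip(*(s.upper() for s in seqs))
    return "".join(IUPAC[frozenset(col)] for col in columns)


# SEQ ID NO: 11 to 16, as given in the alignment above.
P450_2_ALIGNMENT = [
    "cggcgcggtggaggaactgct",
    "gggcgccgtcgaggagctgct",
    "ccgcaccctggaggagctgct",
    "cggcgcggtcgaggagatgct",
    "cgcggcggtggaggagatgct",
    "cggcgcgatcgaggagaccct",
]

print(consensus(P450_2_ALIGNMENT))
```

A practical primer is usually less degenerate than such a full consensus, since each additional ambiguous position multiplies the number of distinct oligonucleotides in the primer pool. Gapped rows, such as the SSU65940 row of the first alignment, would need separate handling.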
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BPO-B MTDVSETTATLPIiARKCPFSPP- -PEYERLRRESPVSRVGLPSGQTAWALTRLEDXREML 

1 JINA -ATVPDIiESDSPHVDWYSTYMLRETAPVTPVRPL-GQDAWLVTGYDE^^ 

**;* .* . * .**. :**: * : *♦**•* :: : * 

EPO -B SSPHFSSD- -RQSPSPPLMVARQI - -RREDKP- PRPSLIAMDPPEHGKARRDWGEPTVK 

IJINA SDIiRLSSDPKKKyPGVEVEFPAyiiGPPEDVRNYPATlM3TSDPPTHT^ 

*. ::*** :: *.. : : *** * : *: * ****.. 

EPO-B RMKALQPRIQQIVDEHIDALLAGPKPADIiVQALSIiPVPSLVICELIiGVPYSDHEFFQSCS 
IJINA RVEAMRPRVEQITAELLDEV-GDSGVVDIVDRPAHPLPIKVICELLGVDEAARGAPGRWS 
*:;*::**::**. * :* : ... .*:*: :: *:* ******** : : * * 

EPO-B SRMLSREW-AEERMTAPBSLENyXDELVTKKEANATEDDLLGRQILKQRESGEADHGEL 
IJINA SEILVMDPERAEQRGQAAREWNFIIiDLVERRRTEPGDDIiLSALISVQDDDDGRLSADEL 

*.:* . **.* * *.. ;*♦ .j^j.^ .* * _ ... _ ^** 

EPO-B VGIiAFLLLIAGHETTANMISLGTVTLLENPDQLAOKADPGKT 

IJINA TSIALVLLLAGFEASVSLIGIGTYLIiLTHPDQIiALVRADPSALPNAVEEIIJiyiAP 

..:*::**:**.*::..:*.:** ** :***** ;:***. *.**.** ,**. 

EPO-B TSRFATADVEIGGTLIRAGBGWGIiSNAGNHDPDGPENPDTPDIERGARHHVAFGFGVHQ 
IJINA T-RFAAEEVEIGGVAIPQYSTVIjVANGAANRDPSQFPDPHRFDVTRDTRGHLSPGQ^ 

* ***; ****** ^ it ^ *. ^^*^*-**, * .*,**. *.:* *j:** *:* 

EPO-B CLGQNLARLELQIOTOTLFRRVPGIRIAVPVDELPPKHDSTIYGLHALPVTW- - 

IJINA CMGRPLAKLEGEVALRALFGRFPALSLGIDADDVVWRRSLIiLRGIDHLPVRLTC 
*;*- **.** ;** ♦.*,: ,*•• : *;. *** 

FIG. 3
SEQUENCE LISTING
<110> Bristol-Myers Squibb Company

<120> COMPOSITIONS AND METHODS FOR HYDROXYLATING EPOTHILONES

<130> D0231 PCT2 

<150> US 10/321,188 

<151> 2002-12-17 

<160> 76 

<170> PatentIn version 3.1 

<210> 1 
<211> 1186 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 1 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 

cggcgcgagg acaagccgtt ccgcccgtcc ctcatcgcga tggacccgcc ggaacacggc 300 

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420 

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540 

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600 

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcgtt cctcctgctc 720 

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780 

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840 

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900 

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960 

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020 

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080 

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140 
gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacg 1186



<210> 2 
<211> 404 
<212> PRT 

<213> Amycolatopsis orientalis 
<400> 2 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1               5                   10                  15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
            20                  25                  30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
        35                  40                  45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
    50                  55                  60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65                  70                  75                  80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Ile Ala Met Asp Pro
                85                  90                  95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
            100                 105                 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
        115                 120                 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
    130                 135                 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145                 150                 155                 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
                165                 170                 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
            180                 185                 190



WO 2004/061116 PCT/US2003/034082

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
        195                 200                 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
    210                 215                 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Phe Leu Leu Leu
225                 230                 235                 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
                245                 250                 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
            260                 265                 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
        275                 280                 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
    290                 295                 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305                 310                 315                 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
                325                 330                 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
            340                 345                 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
        355                 360                 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
    370                 375                 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385                 390                 395                 400

Pro Val Thr Trp
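The correspondence between SEQ ID NO:1 (DNA) and SEQ ID NO:2 (protein) can be spot-checked by translating the nucleotide sequence with the standard genetic code. The sketch below translates the first 60 nucleotides of SEQ ID NO:1 and compares the result with the first 20 residues of SEQ ID NO:2. The names `CODON_TABLE`, `translate`, `SEQ1_START` and `SEQ2_START` are illustrative assumptions, and use of the standard codon table is assumed.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; whitespace is ignored, a trailing partial codon is dropped."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 60 nucleotides of SEQ ID NO:1, with the listing's 10-base grouping.
SEQ1_START = "atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca"

# First 20 residues of SEQ ID NO:2 in one-letter code.
SEQ2_START = "MTDVEETTATLPLARKCPFS"

print(translate(SEQ1_START) == SEQ2_START)
```

The same function can be applied to the full listing entries to confirm that a nucleotide sequence and its stated translation agree residue by residue.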
<210> 3

<211> 195 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 3 

atgaagatca tcgcggacac cgggaagtgc gtgggggcgg gccagtgcgt gctcaccgat 60 
cccgatctgt tcgaccagag cgaggacgac gggacggtcc tcctgctgaa cgccgagccc 120 
gaaggcgaag aggcggagga gaacgcgcgc accgccgtgc acatctgccc ggggcaggca 180 
ctttcgctcg cgtag 195 



<210> 4 
<211> 64 
<212> PRT 

<213> Amycolatopsis orientalis 
<400> 4 

Met Lys Ile Ile Ala Asp Thr Gly Lys Cys Val Gly Ala Gly Gln Cys
1               5                   10                  15

Val Leu Thr Asp Pro Asp Leu Phe Asp Gln Ser Glu Asp Asp Gly Thr
            20                  25                  30

Val Leu Leu Leu Asn Ala Glu Pro Glu Gly Glu Glu Ala Glu Glu Asn
        35                  40                  45

Ala Arg Thr Ala Val His Ile Cys Pro Gly Gln Ala Leu Ser Leu Ala
    50                  55                  60



<210> 5 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 5 

tcctcatcgc cggccacgag ac 22 



<210> 6 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 
<223> Synthetic
<400> 6 

tgctggtcgc cggccacgag ac 22 

<210> 7 

<211> 22 

<212> DNA

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 7 

tgctcatcac cggccaggac ac 22 



<210> 8 

<211> 20 

<212> DNA 

<213> Artificial sequence 
<220>

<223> Synthetic 

<400> 8 

ctgttcgccg ggcacgactc 20 



<210> 9 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 9 

tgctcatcgc gggccacgag ac 22 



<210> 10 

<211> 22 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 10 

tgctggtcgc cgggcacgag ac 22 



<210> 11
<211> 21
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic

<400> 11

cggcgcggtg gaggaactgc t 21

<210> 12 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 12 

gggcgccgtc gaggagctgc t 21 

<210> 13

<211> 21

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 13

ccgcaccctg gaggagctgc t 21

<210> 14

<211> 21

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 14

cggcgcggtc gaggagatgc t 21

<210> 15

<211> 21

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 15

cgcggcggtg gaggagatgc t 21
<210> 16

<211> 21

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 16

cggcgcgatc gaggagaccc t 21

<210> 17

<211> 30

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 17

ttcggcttcg gcgtgcacca gtgcctgggc 30

<210> 18

<211> 30

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 18

ttcggcttcg gcgtccacca gtgcctggga 30

<210> 19

<211> 30

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 19

ttcggctggg gcccccacca ctgcctgggc 30

<210> 20

<211> 30

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 20

ttcggtcacg gcgtccacaa gtgtcctggc 30



<210> 21 

<211> 30 

<212> DNA

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 21 

ttcgggcacg gagcgcacca ctgcatcggc 30 



<210> 22

<211> 30

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 22

ttcggccacg gcatccactt ctgcgtgggc 30

<210> 23

<211> 25

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 23

tgctgctsdt cgccggbcab gasac 25

<210> 24

<211> 25

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic
<220>

<221> misc_feature

<222> (9)..(9)

<223> n = a, c, g or t

<400> 24

tgmtssysnt cgscgsbcay gasac 25
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<210> 25 

<211> 24 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 25 

cggvgcsvts gaggarmtgc tgcg 24 



<210> 26 

<211> 24 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic

<400> 26

cgcagcakyt cctcsabsgc bccg 24

<210> 27

<211> 30

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 27

gcccaggcas ahcacsywg gcdybggctt 30

<210> 28

<211> 27

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 28

gcgagatcta cctggggaag gacaacc 27

<210> 29

<211> 27

<212> DNA

<213> Artificial sequence
<220>

<223> Synthetic

<400> 29

gcgaagctta cggacttgga ccctacg 27

<210> 30 
<211> 1215 
<212> DNA

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 30 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 

cggcgcgagg acaagccgtt ccgcccgtcc ctcatcgcga tggacccgcc ggaacacggc 300 

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420 

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540 

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agagctatct cgacgaactc 600 

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcgtt cttgctgctc 720 

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780 

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840 

gaactcctgc ggatcttcac catcgcggag acggcgaccc cacgcttcgc cacggcggac 900 

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960 

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020 

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080 

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140 

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200 

ccggtcacct ggtag 1215 

<210> 31 
<211> 404 
<212> PRT

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 31 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Ile Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Ser Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Phe Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Pro Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 32
<211> 1215
<212> DNA

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 32 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 

cggcgcgagg acaagccgtt ccgcccgtcc ctcatcgcga tggacccgcc ggaacacggc 300 

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420 

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540 

gtcaccgccg aagaacggat gaccgcgtac gagtcgctcg agaactatct cgacgaactc 600 

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 

cagcgcgaat ccggcgaagc cgaccacggc cgcctggtcg gtctggcgtt cctcctgctc 720 

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780 

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840 

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900 

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960 

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020 

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080 

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140 

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200 

ccggtcacct ggtag 1215 



<210> 33 

<211> 404 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 33

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Ile Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Tyr Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Arg Leu Val Gly Leu Ala Phe Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 34 

<211> 1215 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 34

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240

cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200

ccggtcacct ggtag 1215



<210> 35 
<211> 404 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 35 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 36 
<211> 1104 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 36 

gcgaccttgc cgctggcccg caaatgcccg ttttcaccgc cgcccgaata cgagcggctt 60

cgccgggaaa gtccggtttc ccgggtcggt ctcccgtccg gtcaaaccgc ttgggcgctc 120

acccggctcg aggacatccg cgaaatgctg agcagtccgc atttcagctc cgaccggcag 180

agtccgtcgt tcccgctgat ggtggcccgg cagatccggc gcgaggacaa gccgttccgc 240

ccgtccctca tcgcgatgga cccgccggaa cacagcaagg ccaggcgtga cgtcgtcggg 300

gaattcaccg tcaagcgcat gaaagcgctt cagccgcgta ttcagcagat cgtcgacgag 360

catatcgacg ccatgctcgc cggccccaaa cccgccgatc tcgtccaggc gctttccctg 420

ccggttccgt ccttggtgat ctgcgaactg ctcggtgtcc cctattcgga ccacgagttc 480

ttccagtcct gcagttcccg gatgctcagc cgggaagtca ccgccgaaga acggatgacc 540

gcgttcgagt cgctcgagaa ctatctcgac gaactcgtca cgaagaagga ggcgaacgcc 600

accgaggacg acctcctcgg ccgccagatc ctgaagcagc gcgaaacggg cgaagccgac 660

cacggcgaac tcgtcgggct ggcgttcctg ctgctcatcg cgggacacga gacgacggcg 720

aacatgatct cgctcggcac ggcgaccctg ctggagaacc ccgaccagct ggcgaagatc 780

aaggccgatc cgggcaagac cctcgccgcg atcgaggagc tcctgcgggt cttcaccatc 840

gcggagacgg cgacctcacg cttcgccacg gcggacgtcg agatcggcgg cacgctcatc 900

cgcgcgggtg aaggcgtcgt cggcctgagc aacgcgggca accacgatcc ggaaggcttc 960

gagaacccgg acgccttcga catcgaacgc ggcgcgcggc accacgtcgc cttcggattc 1020

ggtgtgcacc aatgcctcgg ccagaacttg gcgaggttgg aactccagat cgtgttcgat 1080

acgttgttcc ggcgagtgcc gggc 1104



<210> 37 
<211> 1103 

<212> DNA 

<213> Amycolatopsis orientalis 
<400> 37 

gaccttgccg ctggcccgga aatgcccgtt ttcgccgccg cccgaatacg aacggcttcg 60 
Gcgggaaagt ccggtttccc gggtcggtct cccgtccggt caaacggctt gggcgctcac 120 
ccggctcgaa gacatccgcg aaatgctgag cagcccgcat ttcagttccg accggcagag 180 
cccgtcgttc ccgctgatgg tcgcgcggca gatccgccgc gaggacaagc cgttccgccc 240 
ctccctcatc gcgatggatc cgccggaaca cagccgggcc aggcgtgacg tcgtcgggga 300 
attcaccgtc aagcggatga aggcgctcca gccgcgaatt cagcagatcg tcgacgaaca 360 
tctcgacgcc ctgctcgcgg gccccaaacc cgccgatctc gtccaggcgc tttccctgcc 420 
cgttccctcg ctggtgatct gcgaactgct cggcgtcccc tattcggacc acgagttctt 480 
ccagtcctgc agttccagga tgctcagccg ggaggtcacc gccgaagaac ggatgaccgc 540 
gttcgagcag ctcgaaaact atctcgacga actggtcacc aagaaggagg cgaacgccac 600

cgaggacgac ctcctcggcc gtcagatcct gaaacagcgg gaaacgggcg aggccgacca 660 

cggtgaactc gtcgggctgg cgttcctgct gctcatcgcc ggacacgaga ccacggcgaa 720 

catgatctcg ctcggcacgg tgaccctgct ggagaatccc gatcagctcg cgaagatcaa 780 

ggcagaccGC ggcaagaccc tcgccgccat cgaggaactc ctgcgggtct tcacgatcgc 840

ggaaacggcg acctcacgct tcgccacggc ggacgtcgag atcggcggaa cgctgatccg 900 

cgcgggggaa ggggtggtgg gcctgagcaa cgcgggcaac cacgatccgg acggcttcga 960 

gaacccggac accttcgaca tcgaacgcgg cgcgcggcat cacgtcgcgt tcggattcgg 1020 

ggtgcaccag tgtctcggcc agaacttggc gaggttggaa ctccagatcg tcttcgatac 1080 

gttgttccgg cgagtgccgg gcc 1103

<210> 38 
<211> 817 

<212> DNA 

<213> Amycolatopsis orientalis 
<400> 38 

cttcacccgc gcggatgagc gtgccgccga tctcgacgtc cgccgtggcg aagcgtgagg 60 

tcgccgtctc cgcgatggtg aagatccgca ggagttcctc gatcgcggcg agggtcttgc 120 

ccggatccgc cttgatcttc gccagctgat cggggttctc cagcagggtc accgtgccga 180 

gcgagatcat gttcgccgta gtctcgtgcc ccgcgatgag caggaggaac gccagaccga 240 

ccagttcgcc gtggtcggct tcgccggatt cgcgctgctt caggatctgg cggccgagga 300 

ggtcgtcctc ggtggcgttc gcctccttct tcgtgacgag ttcgtcgaga tagttctcga 360 

gcgactcgaa cgcggtcatc cgttcttcgg cggtgacttc ccggctgagc atccgggaac 420 

tgcaggactg gaagaactcg tggtccgaat aggggacacc gagcagttcg cagatcacca 480 

aggacggaac cggcagggaa agcgcctgga cgagatcggc gggtttgggg ccggcgagca 540 

gggcgtcgat atgctcgtcg acgatctgct gaatacgtgg ctgaagcgct ttcatgcgct 600

tgacggtgaa ttccccgacg acgtcacgcc tggccttgcc gtgttccggc gggtccatcg 660 

cgatgaggga cgggcggaac ggcttgtcct cgcgccggat ctgccgcgcc accatcagcg 720 

ggaacgacgg actctgccgg tcggagctga aatgcggact gctcagcatt tcgcggatgt 780 

cttcgagccg ggtgagcgcc caagcggttt gaccgga 817 

<210> 39
<211> 1105
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 39 

ccgcgacctt gccgctggcc cgcaaatgcc cgttttcacc gccgcccgaa tacgagcggc 60 

ttcgccggga aagtccggtt tcccgggtcg gtctcccgtc cggtcaaacc gcttgggcgc 120 

tcacccggct cgaggacatc cgcgaaatgc tgagcagtcc gcatttcagc tccgaccggc 180 

agagtccgtc gttcccgctg atggtggccc ggcagatccg gcgcgaggac aagccgttcc 240 

gcccgtccct catctcgatg gacccgccgg aacacagcaa ggccaggcgt gacgtcgtcg 300 

gggaattcac cgtcaagcgc atgaaagcgc ttcagccgcg tattcagcag atcgtcgacg 360 

agcatatcga cgccctgctc gccggcccca aacccgccga tctcgtccag gcgctttccc 420 

tgccggttcc gtccttggtg atctgcgaac tgctcggtgt cccctattcg gaccacgagt 480

tcttccagtc ctgcagttcc cggatgctca gccgggaagt caccgccgaa gaacggatga 540 

ccgcgttcga gtcgctcgag aactatctcg acgaactcgt cacgaagaag gaggcgaacg 600 

ccaccgagga cgacctcctc ggccgccaga tcctgaagca gcgcgaaacg ggcgaagccg 660 

accacggcga actggtcggg ctggcgttcc tcctgctcat cgcgggacac gagacgacgg 720 

cgaacatgat ctcgctcggc acggcgaccc tgctggagaa ccccgaccag ctggcgaaga 780 

tcaaggccga tccgggcaag accctcgccg cgatcgagga gctcctgcgg gtcttcacca 840 

tcgcggagac ggcgacctca cgcttcgcca cggcggacgt cgagatcggc ggcacgctca 900 

tccgcgcggg tgaaggcgtc gtcggcctga gtaacgcggg caaccacgat ccggaaggct 960 

tcgagaaccc ggacgccttc gacatcgaac gcggcgcgcg gcaccacgtc gccttcggat 1020 

tcggtgtgca ccaatgcctc ggccagaact tggcgaggtt ggaactccag atcgtgttcg 1080 

atacgttgtt ccggcgagtg ccggg 1105 

<210> 40 
<211> 1304 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 40 

ccttgccact ggcccgcaaa tgcccgtttt caccaccgcc cgaatacgag cggctccgcc 60 

gggaaagtcc ggtttcccgg gtcggtctcc cctccggtca aaccgcttgg gcgctcaccc 120 

ggctcgaaga catccgcgaa atgctgagca gtccgcattt cagctccgac cggcagagtc 180 

cgtcgttccc gctgatggtg gcgcggcaga tccggcgcga ggacaagccg ttccgcccgt 240 
ccctcatcgc gatggacccg ccggaacacg gcaaggccag gcgtgacgtc gtcggggaat 300

tcaccgtcaa gcgcatgaaa gcgcttcagc cacgtattca gcagatcgtc gacgagcata 360

tcgacgccct gctcgccggc cccaaacccg ccgatctcgt ccaggcgctt tccctgccgg 420

ttccgtcctt ggtgatctgc gaactgctcg gtgtccccta ttcggaccac gagttcttcc 480

agtcctgcag ttcccggatg ctcagccggg aagtcaccgc cgaagaacgg atgaccgcgt 540

tcgagtcgct cgagaactat ctcgacgaac tcgtcacgaa gaaggaggcg aacgccaccg 600

aggacgacct cctcggccgc cagatcctga agcagcgcga atccggcgaa gccgaccacg 660

gcgaactggt cggtctggcg ttcctcctgc tcatcgcggg gcacgagact acggcgaaca 720

tgatctcgct cggcacggtg accctgctgg agaaccccga tcagctggcg aagatcaagg 780

cggatccggg caagaccctc gccgcgatcg aggaactcct gcggatcttc accatcgcgg 840

agacggcgac ctcacgcttc gccacggcgg acgtcgagat cggcggcacg ctcatccgcg 900

cgggtgaagg cgtcgtcggc ctgagcaacg cgggcaacca cgatccggac ggcttcgaga 960

acccggacac cttcgacatc gaacgcggcg cgcggcatca cgtcgccttc ggattcggtg 1020

tgcaccaatg cctcggccag aacttggcga ggttggaact ccagatcgtg ttcgatacgt 1080

tgttccggcg agtgccgggc atccggatcg ccgtaccggt cgacgaactg ccgttcaagc 1140

acgattcgac gatctacggc ctccgcgccc tgccggtcac ctggtaggag gagccatgaa 1200

gatcatcgcg gacaccggga agtgcgtggg ggcgggccag tgcgtgctca ccgatcccga 1260

tctgttcgac cagagcgagg acgacgggac ggtcctcctg ctga 1304



<210> 41 
<211> 825 

<212> DNA 

<213> Amycolatopsis orientalis
<400> 41 

ctccggtcaa accgcttggg cgctcacccg gctcgaagac atccgcgaaa tgctgagcag 60 
tccgcatttc agctccgacc ggcagaatcc gtcgttcccg ctgatggtgg cgcggcagat 120 
ccggcgcgag gacaagccgt tccgcccgtc cctcatcgcg atggacccgc cggaacacag 180 
caaggccagg cgtgacgtcg tcggggaatt caccgtcaag cgcatgaaag cgcttcagcc 240 
gcgtattcag cagatcgtcg acgagcatat cgacgccctg ctcgccggcc ccaaacccgc 300 
cgatctcgtc caggcgcttt ccctgccggt tccgtccttg gtgatctgcg aactgctcgg 360 
tgtcccctat tcggaccacg agttcttcca gtcctgcagt tcccggatgc tcagccggga 420 
agtcaccgcc gaagaacgga tgaccgcgtt cgagtcgctc gagaactatc tcgacgaact 480

cgtcacgaag aaggaggcga acgccaccga ggacgacctc ctcggccgcc agatcctgaa 540 

gcagcgggaa acgggcgagg ccgaccacgg cgaactcgtc gggctggcgt tcctgctgct 600 

catcgccggg cacgagacga cggcgaacat gatctcgctc ggcacggcga ccctgctgga 660 

gaaccccgac cagctggcga agatcaaggc ggatccgggc aagaccctcg ccgcgatcga 720 

ggaactgctg cgcgtcttca cgatcgcgga gacggcgacc tcacgcttcg ccacggcgga 780 

cgtcgagatc ggcggcacgc tcatccgcgc gggtgaaggc gtcgt 825 

<210> 42 
<211> 1103 
<212> DNA 

<213> Amycolatopsis orientalis 
<400> 42 

gcgaccttgc cactggcccg caaatgcccg ttttcaccac cgcccgaata cgagcggctc 60 

cgccgggaaa gtccggtttc ccgggtcggt ctcccctccg gtcaaaccgc ttgggcgctc 120 

acccggctcg aagacatccg cgaaatgctg agcagtccgc atttcagctc cgaccggcag 180 

agtccgtcgt tcccgctgat ggtggcgcgg cagatccggc gcgaggacaa gccgttccgc 240 

ccgtccctca tcgcgatgga cccgccggaa cacggcaagg ccaggcgtga cgtcgtcggg 300 

gaattcaccg tcaagcgcat gaaagcgctt cagccacgta ttcagcagat cgtcgacgag 360 

catatcgacg ccctgctcgc cggccccaaa cccgccgatc tcgtccaggc gctttccctg 420 

ccggttccgt ccttggtgat ctgcgaactg ctcggtgtcc cctattcgga ccacgagttc 480 

ttccagtcct gcagttcccg gatgctcagc cgggaagtca ccgccgaaga acggatgacc 540

gcgttcgagt cgctcgagaa ctatctcgac gaactcgtca cgaagaagga ggcgaacgcc 600 

accgaggacg acctcctcgg ccgccagatc ctgaagcagc gcgaatccgg cgaagccgac 660 

cacggcgaac tggtcggtct ggcgttcctc ctgctcatcg cggggcacga gactacggcg 720 

aacatgatct cgctcggcac ggtgaccctg ctggagaacc ccgatcagct ggcgaagatc 780 

aaggcggatc cgggcaagac cctcgccgcg atcgaggaac tcctgcggat cttcaccatc 840 

gcggagacgg cgacctcacg cttcgccacg gcggacgtcg agatcggcgg cacgctcatc 900 

cgcgcgggtg aaggcgtcgt cggcctgagc aacgcgggca accacgatcc ggacggcttc 960 

gagaacccgg acaccttcga catcgaacgc ggcgcgcggc atcacgtcgc cttcggattc 1020 

ggtgtgcacc aatgcctcgg ccagaacttg gcgaggttgg aactccagat cgtgttcgat 1080 
acgttgttcc ggcgagtgcc ggg 1103



<210> 43 
<211> 402 
<212> PRT 

<213> Amycolatopsis orientalis
<400> 43

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Ile Ala Met Asp Pro
85 90 95

Pro Glu His Ser Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Met Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Thr
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Phe Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Ala
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Val Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Glu Gly Phe Glu Asn Pro Asp Ala Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val
<210> 44

<211> 367

<212> PRT

<213> Amycolatopsis orientalis

<400> 44

Thr Leu Pro Leu Ala Arg Lys Cys Pro Phe Ser Pro Pro Pro Glu Tyr
1 5 10 15

Glu Arg Leu Arg Arg Glu Ser Pro Val Ser Arg Val Gly Leu Pro Ser
20 25 30

Gly Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu Met
35 40 45

Leu Ser Ser Pro His Phe Ser Ser Asp Arg Gln Ser Pro Ser Phe Pro
50 55 60

Leu Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg Pro
65 70 75 80

Ser Leu Ile Ala Met Asp Pro Pro Glu His Ser Arg Ala Arg Arg Asp
85 90 95

Val Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro Arg
100 105 110

Ile Gln Gln Ile Val Asp Glu His Leu Asp Ala Leu Leu Ala Gly Pro
115 120 125

Lys Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser Leu
130 135 140

Val Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe Phe
145 150 155 160

Gln Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu Glu
165 170 175

Arg Met Thr Ala Phe Glu Gln Leu Glu Asn Tyr Leu Asp Glu Leu Val
180 185 190

Thr Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg Gln
195 200 205

Ile Leu Lys Gln Arg Glu Thr Gly Glu Ala Asp His Gly Glu Leu Val
210 215 220

Gly Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala Asn
225 230 235 240

Met Ile Ser Leu Gly Thr Val Thr Leu Leu Glu Asn Pro Asp Gln Leu
245 250 255

Ala Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu Glu
260 265 270

Leu Leu Arg Val Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe Ala
275 280 285

Thr Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu Gly
290 295 300

Val Val Gly Leu Ser Asn Ala Gly Asn His Asp Pro Asp Gly Phe Glu
305 310 315 320

Asn Pro Asp Thr Phe Asp Ile Glu Arg Gly Ala Arg His His Val Ala
325 330 335

Phe Gly Phe Gly Val His Gln Cys Leu Gly Gln Asn Leu Ala Arg Leu
340 345 350

Glu Leu Gln Ile Val Phe Asp Thr Leu Phe Arg Arg Val Pro Gly
355 360 365



<210> 45 
<211> 272 
<212> PRT 

<213> Amycolatopsis orientalis 
<400> 45 

Ser Gly Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu
1 5 10 15

Met Leu Ser Ser Pro His Phe Ser Ser Asp Arg Gln Ser Pro Ser Phe
20 25 30

Pro Leu Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg
35 40 45

Pro Ser Leu Ile Ala Met Asp Pro Pro Glu His Gly Lys Ala Arg Arg
50 55 60

Asp Val Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro
65 70 75 80

Arg Ile Gln Gln Ile Val Asp Glu His Ile Asp Ala Leu Leu Ala Gly
85 90 95

Pro Lys Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser
100 105 110

Leu Val Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe
115 120 125

Phe Gln Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu
130 135 140

Glu Arg Met Thr Ala Phe Glu Ser Leu Glu Asn Tyr Leu Asp Glu Leu
145 150 155 160

Val Thr Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg
165 170 175

Gln Ile Leu Lys Gln Arg Glu Ser Gly Glu Ala Asp His Gly Glu Leu
180 185 190

Val Gly Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala
195 200 205

Asn Met Ile Ser Leu Gly Thr Val Thr Leu Leu Glu Asn Pro Asp Gln
210 215 220

Leu Ala Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu
225 230 235 240

Glu Leu Leu Arg Ile Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe
245 250 255

Ala Thr Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu
260 265 270



<210> 46 
<211> 367 
<212> PRT

<213> Amycolatopsis orientalis
<400> 46

Ala Thr Leu Pro Leu Ala Arg Lys Cys Pro Phe Ser Pro Pro Pro Glu
1 5 10 15

Tyr Glu Arg Leu Arg Arg Glu Ser Pro Val Ser Arg Val Gly Leu Pro
20 25 30

Ser Gly Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu
35 40 45

Met Leu Ser Ser Pro His Phe Ser Ser Asp Arg Gln Ser Pro Ser Phe
50 55 60

Pro Leu Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg
65 70 75 80

Pro Ser Leu Ile Ser Met Asp Pro Pro Glu His Ser Lys Ala Arg Arg
85 90 95

Asp Val Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro
100 105 110

Arg Ile Gln Gln Ile Val Asp Glu His Ile Asp Ala Leu Leu Ala Gly
115 120 125

Pro Lys Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser
130 135 140

Leu Val Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe
145 150 155 160

Phe Gln Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu
165 170 175

Glu Arg Met Thr Ala Phe Glu Ser Leu Glu Asn Tyr Leu Asp Glu Leu
180 185 190

Val Thr Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg
195 200 205

Gln Ile Leu Lys Gln Arg Glu Thr Gly Glu Ala Asp His Gly Glu Leu
210 215 220

Val Gly Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala
225 230 235 240

Asn Met Ile Ser Leu Gly Thr Ala Thr Leu Leu Glu Asn Pro Asp Gln
245 250 255

Leu Ala Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu
260 265 270

Glu Leu Leu Arg Val Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe
275 280 285

Ala Thr Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu
290 295 300

Gly Val Val Gly Leu Ser Asn Ala Gly Asn His Asp Pro Glu Gly Phe
305 310 315 320

Glu Asn Pro Asp Ala Phe Asp Ile Glu Arg Gly Ala Arg His His Val
325 330 335

Ala Phe Gly Phe Gly Val His Gln Cys Leu Gly Gln Asn Leu Ala Arg
340 345 350

Leu Glu Leu Gln Ile Val Phe Asp Thr Leu Phe Arg Arg Val Pro
355 360 365



<210> 47 
<211> 394 
<212> PRT 

<213> Amycolatopsis orientalis 
<400> 47 

Leu Pro Leu Ala Arg Lys Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu
1 5 10 15

Arg Leu Arg Arg Glu Ser Pro Val Ser Arg Val Gly Leu Pro Ser Gly
20 25 30

Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu Met Leu
35 40 45

Ser Ser Pro His Phe Ser Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu
50 55 60

Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser
65 70 75 80

Leu Ile Ala Met Asp Pro Pro Glu His Gly Lys Ala Arg Arg Asp Val
85 90 95

Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro Arg Ile
100 105 110

Gln Gln Ile Val Asp Glu His Ile Asp Ala Leu Leu Ala Gly Pro Lys
115 120 125

Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser Leu Val
130 135 140

Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe Phe Gln
145 150 155 160

Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu Glu Arg
165 170 175

Met Thr Ala Phe Glu Ser Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr
180 185 190

Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile
195 200 205

Leu Lys Gln Arg Glu Ser Gly Glu Ala Asp His Gly Glu Leu Val Gly
210 215 220

Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala Asn Met
225 230 235 240

Ile Ser Leu Gly Thr Val Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala
245 250 255

Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu
260 265 270

Leu Arg Ile Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr
275 280 285

Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu Gly Val
290 295 300

Val Gly Leu Ser Asn Ala Gly Asn His Asp Pro Asp Gly Phe Glu Asn
305 310 315 320

Pro Asp Thr Phe Asp Ile Glu Arg Gly Ala Arg His His Val Ala Phe
325 330 335

Gly Phe Gly Val His Gln Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu
340 345 350

Leu Gln Ile Val Phe Asp Thr Leu Phe Arg Arg Val Pro Gly Ile Arg
355 360 365

Ile Ala Val Pro Val Asp Glu Leu Pro Phe Lys His Asp Ser Thr Ile
370 375 380

Tyr Gly Leu Arg Ala Leu Pro Val Thr Trp
385 390


<210> 48
<211> 274
<212> PRT
<213> Amycolatopsis orientalis
<400> 48

Ser Gly Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu
1 5 10 15

Met Leu Ser Ser Pro His Phe Ser Ser Asp Arg Gln Asn Pro Ser Phe
20 25 30

Pro Leu Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg
35 40 45

Pro Ser Leu Ile Ala Met Asp Pro Pro Glu His Ser Lys Ala Arg Arg
50 55 60

Asp Val Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro
65 70 75 80

Arg Ile Gln Gln Ile Val Asp Glu His Ile Asp Ala Leu Leu Ala Gly
85 90 95

Pro Lys Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser
100 105 110

Leu Val Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe
115 120 125

Phe Gln Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu
130 135 140

Glu Arg Met Thr Ala Phe Glu Ser Leu Glu Asn Tyr Leu Asp Glu Leu
145 150 155 160

Val Thr Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg
165 170 175

Gln Ile Leu Lys Gln Arg Glu Thr Gly Glu Ala Asp His Gly Glu Leu
180 185 190

Val Gly Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala
195 200 205

Asn Met Ile Ser Leu Gly Thr Ala Thr Leu Leu Glu Asn Pro Asp Gln
210 215 220

Leu Ala Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu
225 230 235 240

Glu Leu Leu Arg Val Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe
245 250 255

Ala Thr Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu
260 265 270

Gly Val



<210> 49 

<211> 367 

<212> PRT 

<213> Amycolatopsis orientalis 

<400> 49 



Ala Thr Leu Pro Leu Ala Arg Lys Cys Pro Phe Ser Pro Pro Pro Glu
1 5 10 15

Tyr Glu Arg Leu Arg Arg Glu Ser Pro Val Ser Arg Val Gly Leu Pro
20 25 30

Ser Gly Gln Thr Ala Trp Ala Leu Thr Arg Leu Glu Asp Ile Arg Glu
35 40 45

Met Leu Ser Ser Pro His Phe Ser Ser Asp Arg Gln Ser Pro Ser Phe
50 55 60

Pro Leu Met Val Ala Arg Gln Ile Arg Arg Glu Asp Lys Pro Phe Arg
65 70 75 80

Pro Ser Leu Ile Ala Met Asp Pro Pro Glu His Gly Lys Ala Arg Arg
85 90 95

Asp Val Val Gly Glu Phe Thr Val Lys Arg Met Lys Ala Leu Gln Pro
100 105 110

Arg Ile Gln Gln Ile Val Asp Glu His Ile Asp Ala Leu Leu Ala Gly
115 120 125

Pro Lys Pro Ala Asp Leu Val Gln Ala Leu Ser Leu Pro Val Pro Ser
130 135 140

Leu Val Ile Cys Glu Leu Leu Gly Val Pro Tyr Ser Asp His Glu Phe
145 150 155 160

Phe Gln Ser Cys Ser Ser Arg Met Leu Ser Arg Glu Val Thr Ala Glu
165 170 175

Glu Arg Met Thr Ala Phe Glu Ser Leu Glu Asn Tyr Leu Asp Glu Leu
180 185 190

Val Thr Lys Lys Glu Ala Asn Ala Thr Glu Asp Asp Leu Leu Gly Arg
195 200 205

Gln Ile Leu Lys Gln Arg Glu Ser Gly Glu Ala Asp His Gly Glu Leu
210 215 220

Val Gly Leu Ala Phe Leu Leu Leu Ile Ala Gly His Glu Thr Thr Ala
225 230 235 240

Asn Met Ile Ser Leu Gly Thr Val Thr Leu Leu Glu Asn Pro Asp Gln
245 250 255

Leu Ala Lys Ile Lys Ala Asp Pro Gly Lys Thr Leu Ala Ala Ile Glu
260 265 270

Glu Leu Leu Arg Ile Phe Thr Ile Ala Glu Thr Ala Thr Ser Arg Phe
275 280 285

Ala Thr Ala Asp Val Glu Ile Gly Gly Thr Leu Ile Arg Ala Gly Glu
290 295 300

Gly Val Val Gly Leu Ser Asn Ala Gly Asn His Asp Pro Asp Gly Phe
305 310 315 320

Glu Asn Pro Asp Thr Phe Asp Ile Glu Arg Gly Ala Arg His His Val
325 330 335

Ala Phe Gly Phe Gly Val His Gln Cys Leu Gly Gln Asn Leu Ala Arg
340 345 350

Leu Glu Leu Gln Ile Val Phe Asp Thr Leu Phe Arg Arg Val Pro
355 360 365



<210> 50
<211> 25
<212> DNA
<213> Artificial
<220>
<223> Synthetic
<400> 50



aggaaaccac cgcgaccttg ccact 25 
<210> 51 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 51 

accgaatccg aaggcgacgt gatgc 25 

<210> 52
<211> 23
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<400> 52

cggaatgaat ccatccgcat acg 23

<210> 53
<211> 23
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<400> 53

tgatcttcat ggctcctcct acc 23

<210> 54
<211> 35
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<220>
<221> misc_feature
<222> (18)..(20)
<223> n=a, c, g or t
<400> 54

gcgaagccga ccacggcnnn ctggtcggtc tggcg 35



<210> 55 
<211> 35 
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<220>
<221> misc_feature
<222> (16)..(18)
<223> n=a, c, g or t
<400> 55

cgccagaccg accagnnngc cgtggtcggc ttcgc 35



<210> 56
<211> 35
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<220>
<221> misc_feature
<222> (14)..(14)
<223> n=a, c, g or t
<400> 56



ggtcggtctg gcgnysctcc tgctcatcgc ggggc 35 



<210> 57
<211> 35
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<220>
<221> misc_feature
<222> (22)..(22)
<223> n=a, c, g or t
<400> 57

gccccgcgat gagcaggags rncgccagac cgacc 35

<210> 58 

<211> 35 

<212> DNA 

<213> Artificial sequence 
<220>
<223> Synthetic
<220>
<221> misc_feature
<222> (17)..(17)
<223> n=a, c, g or t

<400> 58 

ggtcggtctg gcgttcnysc tgctcatcgc ggggc 35 



<210> 59
<211> 35
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<220>
<221> misc_feature
<222> (19)..(19)
<223> n=a, c, g or t
<400> 59

gccccgcgat gagcagsrng aacgccagac cgacc 35



<210> 60 
<211> 1215 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 60 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60
ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120
tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180
ccgcatttca gctccgacca gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240
cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300
aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360
cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420
gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480
gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540
gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600
gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660
cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720
atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780
aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840
gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900
gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960
ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020
cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080
ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140
gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200
ccggtcacct ggtag 1215



<210> 61 
<211> 404 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 61 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Gln Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 62 
<211> 1215 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 62 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 
ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 
tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 
ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 
cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcggga tggacccgcc ggaacacggc 300 
aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 
cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420 
gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 
gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540 
gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600 
gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 
cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720 

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780 

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840 

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900 

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960 

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020 

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080 

ttggaactcc agaccgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140 

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200 

ccggtcacct ggtag 1215 



<210> 63 

<211> 404 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 63 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Gly Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Thr Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 64 
<211> 1215 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 64 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 

cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300 

aaggccaggc gtgacgccgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420 

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540 

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600 

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720 

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780 

aaccccgatc agctggcgaa gatcaaggca gatccgggca agaccctcgc cgcgatcgag 840 
gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200

ccggtcacct ggtag 1215



<210> 65
<211> 404
<212> PRT
<213> Artificial
<220>
<223> Synthetic
<400> 65



Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Ala Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 66 
<211> 1215 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 66 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg ctcgcaaatg cccgttttca 60 

ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 

tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 

ccgcatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 

cggcgcgagg acaagccgtt ccacccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300 

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 

cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420 

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540 

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600 

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720 

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780 

aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840 

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900 

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960 
ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020 

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080 

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140 

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200 

ccggtcacct ggtag 1215 



<210> 67 
<211> 404 
<212> PRT 

<213> Artificial sequence 

<220> 

<223> Synthetic 
<400> 67 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe His Pro Ser Leu Val Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp

<210> 68 

<211> 1215 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 68 



atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60
ccaccgcccg aatacgagcg gctccgccgg aaaagtccgg tttcccgggt cggtctcccc 120
tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180


ccacatttca gctccgaccg gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240


cggcgcgagg acaagccgtt ccgcccgtcc ctcatcgcga tggacccgcc ggaacacggc 300
aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360
cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccgcc 420
gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480
gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggatgct cagccgggaa 540
gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600
gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660
cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcgtt cctcctgctc 720
atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780
aaccccgatc agctggcgaa gatcaaggcg gatccgggca agaccctcgc cgcgatcgag 840
gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900
gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960
ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020
cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080
ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140
gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200
ccggtcacct ggtag 1215



<210> 69 
<211> 404 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 69 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Lys Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Arg Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Ile Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Met
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Phe Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 70
<211> 35
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<220>
<221> misc_feature
<222> (20)..(21)
<223> n=a, c, g or t
<400> 70

gttccgcccg tccctcgtcn nsatggaccc gccgg 35



<210> 71
<211> 35
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic
<220>
<221> misc_feature
<222> (15)..(16)
<223> n=a, c, g or t



<400> 71 

cctgcagttc ccggnnsctc agccgggaag tcacc 35 



<210> 72 
<211> 1215 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic 
<400> 72 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 
ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 
tccggtcaga ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 
ccgcatttca gctccgacca gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 

cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300 

aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaagc gcttcagcca 360 

cgtattcagc agatcgtcga cgagcatacc gacgccctgc tcgccggccc caaacccgcc 420 

gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480 

gtcccctatt cggaccacga gttcttccag tcctgcagtt cccgggcgct cagccgggaa 540 

gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600 

gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660 

cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720 

atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780 

aaccccgatc agctggcgaa gatcaaggcg gacccgggca agaccctcgc cgcgatcgag 840 

gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900 

gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960 

ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020 

cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080 

ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140 

gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200 

ccggtcacct ggtag 1215 

<210> 73 
<211> 404 
<212> PRT 

<213> Artificial sequence 

<220> 

<223> Synthetic 
<400> 73 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Gln Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Thr Asp Ala Leu Leu Ala Gly Pro Lys Pro Ala Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Ala
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285

Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly
290 295 300

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala
305 310 315 320

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile
325 330 335

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln
340 345 350

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp
355 360 365

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp
370 375 380

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu
385 390 395 400

Pro Val Thr Trp



<210> 74 
<211> 1215 
<212> DNA 

<213> Artificial sequence 

<220> 

<223> Synthetic 
<400> 74 

atgaccgacg tcgaggaaac caccgcgacc ttgccactgg cccgcaaatg cccgttttca 60 
ccaccgcccg aatacgagcg gctccgccgg gaaagtccgg tttcccgggt cggtctcccc 120 
tccggtcaaa ccgcttgggc gctcacccgg ctcgaagaca tccgcgaaat gctgagcagt 180 
ccgcatttca gctccgacca gcagagtccg tcgttcccgc tgatggtggc gcggcagatc 240 
cggcgcgagg acaagccgtt ccgcccgtcc ctcgtcgcga tggacccgcc ggaacacggc 300 
aaggccaggc gtgacgtcgt cggggaattc accgtcaagc gcatgaaggc gcttcagcca 360 
cgtattcagc agatcgtcga cgagcatatc gacgccctgc tcgccggccc caaacccacc 420
gatctcgtcc aggcgctttc cctgccggtt ccgtccttgg tgatctgcga actgctcggt 480
gtcccctatt cggaccacga gttcttccag tcctgcagtt cccggtcgct cagccgggaa 540
gtcaccgccg aagaacggat gaccgcgttc gagtcgctcg agaactatct cgacgaactc 600
gtcacgaaga aggaggcgaa cgccaccgag gacgacctcc tcggccgcca gatcctgaag 660
cagcgcgaat ccggcgaagc cgaccacggc gaactggtcg gtctggcggc gctcctgctc 720
atcgcggggc acgagactac ggcgaacatg atctcgctcg gcacggtgac cctgctggag 780
aaccccgatc agctggcgaa gatcaaggcg gacccgggca agaccctcgc cgcgatcgag 840
gaactcctgc ggatcttcac catcgcggag acggcgacct cacgcttcgc cacggcggac 900
gtcgagatcg gcggcacgct catccgcgcg ggtgaaggcg tcgtcggcct gagcaacgcg 960
ggcaaccacg atccggacgg cttcgagaac ccggacacct tcgacatcga acgcggcgcg 1020
cggcatcacg tcgccttcgg attcggtgtg caccaatgcc tcggccagaa cttggcgagg 1080
ttggaactcc agatcgtgtt cgatacgttg ttccggcgag tgccgggcat ccggatcgcc 1140
gtaccggtcg acgaactgcc gttcaagcac gattcgacga tctacggcct ccacgccctg 1200
ccggtcacct ggtag 1215



<210> 75 

<211> 404 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic 

<400> 75 

Met Thr Asp Val Glu Glu Thr Thr Ala Thr Leu Pro Leu Ala Arg Lys
1 5 10 15

Cys Pro Phe Ser Pro Pro Pro Glu Tyr Glu Arg Leu Arg Arg Glu Ser
20 25 30

Pro Val Ser Arg Val Gly Leu Pro Ser Gly Gln Thr Ala Trp Ala Leu
35 40 45

Thr Arg Leu Glu Asp Ile Arg Glu Met Leu Ser Ser Pro His Phe Ser
50 55 60

Ser Asp Gln Gln Ser Pro Ser Phe Pro Leu Met Val Ala Arg Gln Ile
65 70 75 80

Arg Arg Glu Asp Lys Pro Phe Arg Pro Ser Leu Val Ala Met Asp Pro
85 90 95

Pro Glu His Gly Lys Ala Arg Arg Asp Val Val Gly Glu Phe Thr Val
100 105 110

Lys Arg Met Lys Ala Leu Gln Pro Arg Ile Gln Gln Ile Val Asp Glu
115 120 125

His Ile Asp Ala Leu Leu Ala Gly Pro Lys Pro Thr Asp Leu Val Gln
130 135 140

Ala Leu Ser Leu Pro Val Pro Ser Leu Val Ile Cys Glu Leu Leu Gly
145 150 155 160

Val Pro Tyr Ser Asp His Glu Phe Phe Gln Ser Cys Ser Ser Arg Ser
165 170 175

Leu Ser Arg Glu Val Thr Ala Glu Glu Arg Met Thr Ala Phe Glu Ser
180 185 190

Leu Glu Asn Tyr Leu Asp Glu Leu Val Thr Lys Lys Glu Ala Asn Ala
195 200 205

Thr Glu Asp Asp Leu Leu Gly Arg Gln Ile Leu Lys Gln Arg Glu Ser
210 215 220

Gly Glu Ala Asp His Gly Glu Leu Val Gly Leu Ala Ala Leu Leu Leu
225 230 235 240

Ile Ala Gly His Glu Thr Thr Ala Asn Met Ile Ser Leu Gly Thr Val
245 250 255

Thr Leu Leu Glu Asn Pro Asp Gln Leu Ala Lys Ile Lys Ala Asp Pro
260 265 270

Gly Lys Thr Leu Ala Ala Ile Glu Glu Leu Leu Arg Ile Phe Thr Ile
275 280 285



Ala Glu Thr Ala Thr Ser Arg Phe Ala Thr Ala Asp Val Glu Ile Gly 
290 295 300 

Gly Thr Leu Ile Arg Ala Gly Glu Gly Val Val Gly Leu Ser Asn Ala 
305 310 315 320 

Gly Asn His Asp Pro Asp Gly Phe Glu Asn Pro Asp Thr Phe Asp Ile 
325 330 335 

Glu Arg Gly Ala Arg His His Val Ala Phe Gly Phe Gly Val His Gln 
340 345 350 

Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Gln Ile Val Phe Asp 
355 360 365 

Thr Leu Phe Arg Arg Val Pro Gly Ile Arg Ile Ala Val Pro Val Asp 
370 375 380 

Glu Leu Pro Phe Lys His Asp Ser Thr Ile Tyr Gly Leu His Ala Leu 
385 390 395 400 

Pro Val Thr Trp 



<210> 76 

<211> 404 

<212> PRT 

<213> Saccharopolyspora erythraea 

<400> 76 



Met Thr Thr Val Pro Asp Leu Glu Ser Asp Ser Phe His Val Asp Trp 
1 5 10 15 

Tyr Arg Thr Tyr Ala Glu Leu Arg Glu Thr Ala Pro Val Thr Pro Val 
20 25 30 

Arg Phe Leu Gly Gln Asp Ala Trp Leu Val Thr Gly Tyr Asp Glu Ala 
35 40 45 

Lys Ala Ala Leu Ser Asp Leu Arg Leu Ser Ser Asp Pro Lys Lys Lys 
50 55 60 

Tyr Pro Gly Val Glu Val Glu Phe Pro Ala Tyr Leu Gly Phe Pro Glu 
65 70 75 80 

Asp Val Arg Asn Tyr Phe Ala Thr Asn Met Gly Thr Ser Asp Pro Pro 
85 90 95 

Thr His Thr Arg Leu Arg Lys Leu Val Ser Gln Glu Phe Thr Val Arg 
100 105 110 

Arg Val Glu Ala Met Arg Pro Arg Val Glu Gln Ile Thr Ala Glu Leu 
115 120 125 

Leu Asp Glu Val Gly Asp Ser Gly Val Val Asp Ile Val Asp Arg Phe 
130 135 140 

Ala His Pro Leu Pro Ile Lys Val Ile Cys Glu Leu Leu Gly Val Asp 
145 150 155 160 

Glu Lys Tyr Arg Gly Glu Phe Gly Arg Trp Ser Ser Glu Ile Leu Val 
165 170 175 

Met Asp Pro Glu Arg Ala Glu Gln Arg Gly Gln Ala Ala Arg Glu Val 
180 185 190 

Val Asn Phe Ile Leu Asp Leu Val Glu Arg Arg Arg Thr Glu Pro Gly 
195 200 205 

Asp Asp Leu Leu Ser Ala Leu Ile Arg Val Gln Asp Asp Asp Asp Gly 
210 215 220 

Arg Leu Ser Ala Asp Glu Leu Thr Ser Ile Ala Leu Val Leu Leu Leu 
225 230 235 240 

Ala Gly Phe Glu Ala Ser Val Ser Leu Ile Gly Ile Gly Thr Tyr Leu 
245 250 255 

Leu Leu Thr His Pro Asp Gln Leu Ala Leu Val Arg Arg Asp Pro Ser 
260 265 270 

Ala Leu Pro Asn Ala Val Glu Glu Ile Leu Arg Tyr Ile Ala Pro Pro 
275 280 285 

Glu Thr Thr Thr Arg Phe Ala Ala Glu Glu Val Glu Ile Gly Gly Val 
290 295 300 

Ala Ile Pro Gln Tyr Ser Thr Val Leu Val Ala Asn Gly Ala Ala Asn 
305 310 315 320 

Arg Asp Pro Lys Gln Phe Pro Asp Pro His Arg Phe Asp Val Thr Arg 
325 330 335 

Asp Thr Arg Gly His Leu Ser Phe Gly Gln Gly Ile His Phe Cys Met 
340 345 350 

Gly Arg Pro Leu Ala Lys Leu Glu Gly Glu Val Ala Leu Arg Ala Leu 
355 360 365 

Phe Gly Arg Phe Pro Ala Leu Ser Leu Gly Ile Asp Ala Asp Asp Val 
370 375 380 

Val Trp Arg Arg Ser Leu Leu Leu Arg Gly Ile Asp His Leu Pro Val 
385 390 395 400 

Arg Leu Asp Gly 